Preface


This book summarizes our research work conducted at the University
of Colorado at Boulder from 2000 to 2004 while the first author was pur- 
suing his Ph.D. degree in the Department of Electrical and Computer 
Engineering. Our research addresses the problem of applying automatic 
abstraction refinement to the model checking of large scale digital sys- 
tems. 

Model checking is a formal method for proving that a finite state tran- 
sition system satisfies a user-defined specification. The primary obstacle 
to its widespread application is the capacity problem: State-of-the-art 
model checkers cannot directly handle most industrial-scale designs. Abstraction
refinement, an iterative process of synthesizing a simplified
model to help verify the original model, is a promising solution to the
capacity problem. In this book, several fully automatic abstraction re- 
finement techniques are proposed to efficiently reach or come close to 
the simplest abstraction. 

First, a fine-grain abstraction approach is proposed to keep the ab- 
straction granularity small. With the advantage of including only the 
relevant information, the fine-grain abstraction is proved to be indispens- 
able in verifying systems with complex combinational logic. A scalable 
game-based refinement algorithm called GRAB is proposed to identify the 
refinement variables based on the systematic analysis of all the short- 
est counterexamples. Compared to methods in which each refinement 
is guided by a single counterexample, this algorithm often produces a 
smaller abstract model that can prove or refute the same property. 

Second, a compositional SCC analysis algorithm called DNC is pro- 
posed in the context of LTL model checking to quickly identify unim- 
portant parts of the state space in previous abstractions and prune them 
away before verification is applied to the next abstraction level. With a 
speed-up of up to two orders of magnitude over standard symbolic fair
cycle detection algorithms, DNC demonstrates the importance of reusing
information learned from previous abstraction levels to help verification 
at the current level. 

Finally, BDD based symbolic image computation and Boolean sat- 
isfiability check are revisited in the context of abstraction refinement. 
We propose two new algorithms in order to improve the computational 
efficiency of BDD based symbolic fixpoint computation and SAT based 
bounded model checking, by applying the idea of abstraction and suc- 
cessive refinements inside the two basic decision procedures. 

Analytical and experimental studies demonstrate that the fully auto- 
matic abstraction refinement techniques proposed in this book are the 
key to applying model checking to large systems. The suite of fully au- 
tomatic abstraction refinement algorithms has demonstrated significant 
practical importance. Some of these BDD and SAT based algorithms 
have been adopted by various commercial/in-house verification tools in 
industry. 

The Ph.D. dissertation upon which this book is based won the 2003- 
2004 ACM outstanding Ph.D. dissertation award in electronic design 
automation. This ACM award, established by ACM SIGDA, is given 
each year to an outstanding Ph.D. dissertation that makes the most 
substantial contribution to the theory and/or application in the field of 
electronic design automation. 


CHAO WANG, GARY D. HACHTEL, FABIO SOMENZI


Chapter 1


INTRODUCTION 


Our society is increasingly dependent on various electronic and computer
systems. These systems are used in consumer electronics, automobiles,
medical devices, traffic controllers, avionics, space programs,
etc. Many of these systems can be classified as critical systems: safety-critical,
mission-critical, or cost-critical. Design errors in these critical
systems are generally intolerable, since they either cost a lot of money, 
or cost lives. However, designing a flawless computer system is becoming 
harder as the size of new systems gets larger. In the hardware design 
community, for instance, functional verification has been identified as the 
bottleneck in the entire design process. According to ITRS (the Inter- 
national Technology Roadmap for Semiconductors [ITR03]), two thirds 
of a typical ASIC design budget go into verification, and verification 
engineers frequently outnumber design engineers in large project teams. 
Still, over 60% of IC designs require a second “spin” due to logic
and functional level errors. Similar problems also exist in the software 
community, especially in the design and implementation of embedded 
and safety-related software systems (device drivers, air traffic control 
systems, security protocols, etc.). The vast majority of verification ex- 
perts believe that formal analysis methods are indispensable in coping 
with this “verification crisis.”

Traditional verification techniques are simulation and testing. Simu- 
lation is applied to a model of the product, while testing is applied to 
the product itself. The basic idea of simulation and testing is feeding 
in some test vectors and then checking the output for correctness. The 
disadvantage of this “trial-and-error” based approach is that all the pos- 
sible input conditions must be checked in order to make sure the design 
is correct. However, even for pure combinational circuits, it is infeasible
to enumerate all the possible input conditions except for very small
designs. For sequential circuits, there can be an infinite number of in- 
put conditions due to the possibly unbounded number of time instances. 
Therefore, although simulation and testing are very useful in detecting 
"bugs" in the early stages of the design process, they are not suitable 
for certifying that the design meets the specification. 


To get a mathematical proof that the design satisfies a given spec- 
ification under all possible input conditions, one needs formal verifi- 
cation techniques. Model checking and theorem proving are two rep- 
resentatives of the existing formal verification techniques. Given a fi- 
nite state model and a property expressed in a temporal logic, a model 
checker can construct a formal proof when the model satisfies the prop- 
erty [CE81, QS81]. If the property fails, the model checker can show 
how it fails by generating a counterexample trace. Model checking is
fully automatic in the sense that the construction of proof or refutation 
does not require the user's intervention. This is in contrast to the formal 
techniques based on theorem proving, which rely on the user's expertise 
in logics and deductive proof systems to complete the verification. 

Model checking has been regarded as a potential solution to the “verification
crisis” in the computer hardware design community. It is showing
promise for many other applications as well, including real-time system
verification [AHH96], parameterized system verification [EK03], and
software verification [VB00, BMMR01, MPC+02].


The primary obstacle to the widespread application of model check- 
ing to real-world designs is the capacity problem. Since model checking 
uses an exhaustive search of the state space of the model to determine 
whether a specification is true or false, the complexity of model check- 
ing depends on the number of states of the model as well as the length 
of the specification. Due to its exponential dependence on the num- 
ber of state variables or memory elements, the number of states of the 
model can be extremely large even for a moderate-size model. This is 
known as the state explosion problem. A major breakthrough in dealing
with state explosion was symbolic model checking [BCM+90, McM94]
based on Binary Decision Diagrams (BDDs [Bry86]). However, even 
with these symbolic techniques, the capacity of model checking remains 
limited: The state-of-the-art model checkers still cannot directly han- 
dle most industry-scale designs. In fact, symbolic model checkers often 
lose their robustness when the model has more than 200 binary state 
variables; at the same time, hardware systems become more and more 
complex because of Moore's law and the increasing use of high level hard- 
ware description languages (HDLs)—models with thousands or tens of 
thousands of state variables may yet look modest.


1.1 Background


Abstraction is an important technique to bridge the capacity gap be- 
tween the model checker and large digital systems. When a system 
cannot be directly handled by the model checker, abstraction can be 
used to remove information that is irrelevant to the verification of the 
given property. We then build an abstract model which hopefully is 
much simpler and apply model checking to it. In doing so, an abstract 
interpretation [CC77], or a relation between the abstract system and 
the concrete system is created. For it to be useful in model checking, 
abstraction must preserve or at least partially preserve the property to 
be verified. There exist automatic abstraction techniques that preserve 
a certain class of temporal logic properties. For instance, bi-simulation 
based reduction [Mil71, DHWT91] preserves the entire propositional
$\mu$-calculus. However, property-preserving abstractions are either very hard
to compute or do not achieve a drastic reduction [FV99], and therefore 
are less attractive in practice. A more practical approach is called property
driven abstraction, which preserves or partially preserves only the
property at hand. Along this line, Balarin et al. [BSV93], Long [Lon93],
and Cho et al. [CHM+96a] have studied various ways of deriving an
abstract model from the concrete system for model checking. 

Abstraction refinement was introduced by Kurshan [Kur94] in the 
context of model checking linear properties specified as $\omega$-regular automata.
In this paradigm, verification is viewed as an iterative process
of synthesizing a simplified model that is sufficient to prove or refute the 
given property. In COSPAN [HHK96], the initial abstraction contains 
only the state variables in the property and leaves the other variables 
unconstrained. Since unconstrained variables can take arbitrary values, 
the abstract model is an over-approximation in the sense that it contains
all possible execution traces of the original model, and possibly more. 
Therefore, when a linear time property holds in the abstract model, it 
also holds in the concrete model; when the property fails in the abstract 
model, however, the result is inconclusive. In the case of an inconclusive
result, the abstract model is refined by adding back some relevant but
previously unconstrained variables. The key issue in abstraction refine- 
ment is to identify in advance which variable is relevant and which is 
not. Note that an over-approximated abstraction is applicable not only 
to linear properties specified as $\omega$-regular automata, but also to other
universal properties including LTL [Pnu77] and ACTL [CE81, EH83], 
because over-approximation suffices to prove these properties true. 
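
The loop described above can be illustrated with a minimal explicit-state sketch. It is not the COSPAN procedure or any algorithm from this book: the model is a toy one with Boolean variables, the property is an invariant, and the refinement step naively adds the first hidden variable. All names (`bad_reachable`, `refine_until_decided`, `next_fn`) are hypothetical and chosen only for illustration.

```python
from itertools import product

def bad_reachable(variables, visible, next_fn, init, bad):
    """Reachability in the abstraction that keeps only the `visible` variables.

    Hidden variables are unconstrained: at every step they may take any value,
    so the abstract model over-approximates the concrete one.
    """
    vis = [v for v in variables if v in visible]
    hidden = [v for v in variables if v not in visible]
    start = tuple(init[v] for v in vis)
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        current = dict(zip(vis, state))
        if bad(current):
            return True
        for values in product([False, True], repeat=len(hidden)):
            full = {**current, **dict(zip(hidden, values))}
            succ = tuple(next_fn[v](full) for v in vis)
            if succ not in seen:
                seen.add(succ)
                frontier.append(succ)
    return False

def refine_until_decided(variables, property_vars, next_fn, init, bad):
    """Start from the property variables and refine until the result is conclusive."""
    visible = set(property_vars)
    while True:
        if not bad_reachable(variables, visible, next_fn, init, bad):
            return True, visible       # holds in the abstraction, hence in the concrete model
        if len(visible) == len(variables):
            return False, visible      # nothing is hidden, so the failure is real
        # Naive refinement (an assumption of this sketch): add the first hidden variable.
        visible.add(next(v for v in variables if v not in visible))

# Toy model: p copies q, q is stuck at False, r toggles and is irrelevant to p.
variables = ["p", "q", "r"]
next_fn = {"p": lambda s: s["q"], "q": lambda s: False, "r": lambda s: not s["r"]}
init = {"p": False, "q": False, "r": False}
holds, kept = refine_until_decided(variables, ["p"], next_fn, init, lambda s: s["p"])
```

In this toy run the initial abstraction (only `p` visible) is inconclusive, one refinement adds `q`, and the property is then proved without ever adding `r`.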

For practical reasons, it is important to keep the abstraction refine- 
ment process fully automatic. Manual abstraction can be very powerful 
when it is carried out carefully by experienced users. However, it often
requires a significant amount of user intervention and in-depth knowledge
of the design. In fact, manual abstraction is very labor intensive
and can be error-prone even for skilled users, making it hard for ver- 
ification to keep up with the design schedule in an industrial setting. 
Therefore, fully automated abstraction techniques are far more attrac- 
tive in practice. In abstraction refinement, a procedure typically starts 
with a coarse initial abstraction and then automatically augments the 
abstract model by iterative refinement. 


The main challenge in abstraction refinement is the ability
to generate a final abstract model that is as simple as possible. The
final abstraction, or deciding abstraction, is the one that decides the 
truth of the property to be verified. One can always start with a very 
coarse initial abstraction and keep refining it until the abstraction be- 
comes deciding. Therefore, the effectiveness of the refinement algorithm 
is critical in keeping the final abstract model small. Existing refinement 
algorithms can be classified into the following categories. Some refine- 
ment algorithms rely on information about the structure of the model, 
e.g., the pair-wise latch relation [LPJ+96] or the variable dependency
graph [LNA99]. Some refinement algorithms rely on the analysis of the 
set of approximate satisfying states of the given property produced in a 
previous model checking run, e.g., the operation-based refinement methods
[PH98, JMHO00]. Some refinement algorithms are driven by spurious
abstraction counterexamples produced in a previous model checking
run [CGJ+00, WHL+01, CGKS02, CCK+02, GKMH+03, MH04]; in
these methods, the goal of refinement is to remove the abstract coun- 
terexamples that do not correspond to any real path in the concrete 
model. Other refinement algorithms rely on the analysis of unsuccessful 
bounded model checking runs [MA03, LWS03, GGYA03, LS04, LWS05, 
ZPHS05]. In the latter cases, unsatisfiability proofs of these bounded 
model checking instances directly induce abstract models that are suffi- 
cient for disabling all counterexamples of a certain length. 


The simplicity of the final abstract model is bounded ultimately by 
the degree of locality of the given property in the model. In general, 
a high degree of locality is necessary for the success of abstraction re- 
finement. For a property whose proof or refutation relies on detailed 
knowledge of the entire system, it is clear that abstraction refinement is 
ineffective. In practice, however, the properties used in model checking 
are often partial specifications of the system behavior, and user-specified 
properties tend to depend on only part of the system. This is largely due 
to the structured programming or design style adopted by engineers. In 
this case, it is the refinement algorithm's responsibility to exploit fully 
the degree of locality of a given property. To measure the quality of different
abstraction refinement algorithms, we define an important metric
called abstraction efficiency as follows: 


$$\eta = 1 - \frac{\text{final abstract model size}}{\text{original model size}}.$$

For every pair of model $M$ and property $\phi$, there exists an optimum or
maximum abstraction efficiency $\eta^*$. Note that $\eta^*$ is a property of the
specific verification problem $(M, \phi)$, not a property of the abstraction
refinement algorithm. As a heuristic principle, the closer an abstraction
refinement algorithm comes to this optimum value, the better the
algorithm is.
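
As a small worked example, the metric can be computed as one minus the ratio of the final abstract model size to the original model size. The sketch below assumes that "size" is measured as a count of state variables, which is only one possible choice; the function name and the numbers are hypothetical.

```python
def abstraction_efficiency(final_size, original_size):
    """Return 1 - final_size / original_size (sizes assumed to be variable counts)."""
    if original_size <= 0 or not 0 <= final_size <= original_size:
        raise ValueError("expected 0 <= final_size <= original_size and original_size > 0")
    return 1 - final_size / original_size

# Hypothetical numbers: a 1000-variable design decided with a 50-variable abstraction.
eta = abstraction_efficiency(final_size=50, original_size=1000)
# An abstraction that ends up as large as the original model has efficiency 0.
eta_none = abstraction_efficiency(final_size=1000, original_size=1000)
```

Under this reading, an efficiency close to 1 means that the deciding abstraction is a small fraction of the original model.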

Another important metric for abstraction refinement is the rate of 
convergence. This characterizes how quickly a refinement algorithm 
converges from the initial abstract model to a deciding abstraction. In 
practice, this can be measured either by the number of refinement itera- 
tions or by the overall run time. We have observed cases for which some 
algorithms converge quickly to a near-optimal abstraction while other
algorithms spend a lot of time searching in vain for such an abstraction. 
In the ideal case, an algorithm should find, at each abstraction refine- 
ment iteration step, a set of refinement variables that is a subset of an 
optimum deciding abstraction. 


1.2 Our Contributions 


This book deals with the main challenge in abstraction refinement, 
i.e., the ability to efficiently reach or come close to the optimum deciding 
abstraction. We propose several fully automatic abstraction techniques 
in order to improve the overall computation efficiency as well as the rate 
of convergence. Together, they address the following three problems that 
are critical in the abstraction refinement loop: 


1 How to make the abstraction more concise? 


2 How to identify and reuse critical information from previous abstrac- 
tion levels? 


3 How to make the basic decision procedures used in abstraction re- 
finement more efficient? 


In order to achieve a higher abstraction efficiency, it is crucial to 
keep the refinement granularity small so that only the relevant infor- 
mation is included in the abstract model. That is, each successive re- 
finement should include only variables that are present in an optimal 
or near-optimal deciding abstraction. In previous work, the abstraction
granularity is often limited at the state variable level: the entire fan-in
combinational logic cone of a state variable is either included in or
completely excluded from the abstract model. However, it is often the 
case that not every one of these fan-in logic gates is necessary for the 
verification of a certain property, even if the state variable itself is in- 
deed necessary. Including these redundant logic gates often significantly 
increases the complexity of the abstract model—an abstract model with 
few state variables may end up containing a large number of logic gates. 


In this book, we propose a fine-grain abstraction approach to push the 
granularity of abstraction beyond the usual state variable level. Boolean 
network variables are selectively inserted into large combinational logic 
cones to partition them into smaller pieces. In the abstraction as well 
as the successive refinements, Boolean network variables are given the 
same status as state variables—both are considered as atoms. With this 
approach, refinement strategies must search a two-dimensional space. 
Refinement in the sequential direction is comprised of the addition of new 
state variables only, which is typical of much of the prior art [LPJ+96,
JMHO00, CGJ+00, CGKS02]. Refinement in the Boolean direction is
comprised of the addition of Boolean network variables only, which does 
not increase the number of abstract states but refines the transition 
relation among them. Although cut-set variables that are similar to 
Boolean network variables were used in the previous work of Wang et 
al. [WHL+01] and Glusman et al. [GKMH+03], these variables were not
treated the same as state variables during refinement. We shall show that 
by separating the two refinement directions and carefully controlling the 
direction at each iteration step, we can produce refinement variable sets 
that are significantly more concise. 


Spurious counterexamples in an abstract model have been used in 
previous work to compute the set of refinement variables. With the exception
of [GKMH+03], the prior art of counterexample based refinement
relies exclusively on a single counterexample. In practice, however, there 
can be an extremely large number of spurious counterexamples when the 
property fails. In that case, arbitrarily picking one counterexample
and using it to drive the refinement is a “needle-in-the-haystack” approach.
In this book, we present a way to capture, for invariant properties, all 
the shortest counterexamples using a data structure called the Synchro- 
nous Onion Rings (SORs). A new refinement algorithm, called GRAB, 
is proposed to identify the refinement variables by systematically ana- 
lyzing all the shortest counterexamples. GRAB has two novel features: 
First, it takes a generation of refinement steps to systematically eliminate
all spurious counterexamples supported by a given set of SORs.
Second, each refinement step in the current generation is computed using
a scalable game-based strategy that depends solely on the current
abstract model. Note that being able to compute the refinement without 
using the concrete model is crucial to the scalability of the algorithm, 
since the working assumption is that the concrete model is large and
any computation on it is prohibitively expensive. In contrast, previous
refinement methods in [CGJ+00, CGKS02, CCK+02] do not scale well,
because they rely on computation in the concrete model. 
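
To make the idea of capturing all the shortest counterexamples concrete, the following explicit-state sketch builds one ring of states per time step. It assumes a particular construction (forward reachability rings until a bad state is first hit, then a backward pass that keeps only states lying on a shortest path), which is one plausible reading and not necessarily the exact SOR definition given later in the book. The function name and the example graph are hypothetical, and a real implementation would be symbolic rather than enumerative.

```python
def synchronous_onion_rings(succ, init, bad):
    """Return rings R[0..L] covering every shortest path from `init` to `bad`.

    `succ` maps a state to its successors; `init` and `bad` are sets of states.
    Returns None when no bad state is reachable.
    """
    # Forward pass: forward[i] holds the states first reached after exactly i steps.
    forward = [set(init)]
    seen = set(init)
    while not (forward[-1] & bad):
        frontier = {t for s in forward[-1] for t in succ.get(s, ())} - seen
        if not frontier:
            return None
        seen |= frontier
        forward.append(frontier)
    # Backward pass: keep a state only if it can step into the next ring.
    rings = [None] * len(forward)
    rings[-1] = forward[-1] & bad
    for i in range(len(forward) - 2, -1, -1):
        rings[i] = {s for s in forward[i]
                    if any(t in rings[i + 1] for t in succ.get(s, ()))}
    return rings

# Hypothetical abstract model: two shortest counterexamples 0-1-3 and 0-2-3.
succ = {0: [1, 2], 1: [3], 2: [3, 4], 3: [], 4: [4]}
rings = synchronous_onion_rings(succ, init={0}, bad={3})
```

State 4 is reachable in two steps but lies on no shortest path to the bad state, so it is dropped from the last ring.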


Due to the global guidance from the SORs, and the quality and scalability
of the game-based variable selection computation, GRAB demonstrates
significant advantages over these previous refinement algorithms:
it can solve significantly larger problems and requires less memory and less
CPU time. Although the method in [GKMH+03] is also driven by multiple
counterexamples, it is not guaranteed to capture each and every
one of the shortest counterexamples. As a result, this refinement method 
is often less accurate than the SOR based refinement and is incapable 
of catching concretizable counterexamples at the earliest possible refine- 
ment step. 

Proof based abstraction methods in [MA03, LWS03, GGYA03, LS04,
LWS05, ZPHS05] capture implicitly all the shortest counterexamples.
However, these are SAT based methods and rely on a SAT solver to pro- 
duce the unsatisfiability proof of a SAT instance in the concrete model. 
In contrast, our core refinement variable selection algorithm is purely BDD
based, even though we also use SAT in the concretization test and in
predicting the refinement direction. We note that a small unsatisfiability
proof, i.e., one with a small subset of Boolean variables or clauses,
does not automatically give a small refinement set [LS04, GGA05b]. 
Both proof-based and counterexample based methods have their own 
advantages and disadvantages. A detailed experimental comparison 
of GRAB with a proof-based refinement algorithm can be found in 
[LWS05], showing that these two methods complement each other on the 
various test cases. Amla et al. [ADK+05] also published results of their
experimental evaluation of the various SAT based abstraction methods. 
There is also a trend of combining counterexample based methods and 
proof-based methods in abstraction refinement [AM04]. 

In abstraction refinement, we need to model check the abstract model 
repeatedly while it is gradually refined. Information gathered at previous 
abstraction levels can be carried over and used to speed up the verification
at the current level. In this book, we propose a compositional
SCC (Strongly Connected Component) analysis algorithm, called DNC, 
to quickly identify unimportant parts of the state space and prune them 
away before going to the next abstraction. The search state space is also 
disjunctively decomposed into smaller subspaces that can be checked in
isolation. Although there exist several symbolic SCC algorithms in the
prior art [HTKB92, HKSV97, XB00, BGS00, GPP03] and some of them 
have been applied to model checking [RBS00, FFK+01, SRB02], these
methods are not compositional, and would not be effective on the larger 
practical examples studied in this book. In this book, we also prove that 
the strength of an SCC or a set of SCCs decreases monotonically with re- 
finement, which allows the model checking algorithm to tailor the proof 
to the strength of the SCC at hand. The concept of automaton strength 
is due to Kupferman and Vardi [KV98] and Bloem et al. [BRS99]. Although
the strength of the automaton was used in [BRS99] to improve
LTL model checking, we believe that DNC is the first to systematically 
exploit this important property in the context of abstraction refinement. 


The idea of abstraction followed by successive refinements is also
applied to the two basic decision procedures used in model checking: 
BDD based symbolic image computation and Boolean Satisfiability (SAT) 
check. Image computation [CBM89a, GB94] accounts for most of the 
CPU time in BDD based symbolic model checking. The peak sizes 
of the BDDs produced during the computation are essential in deter- 
mining whether or how fast image computation can be completed on 
a given computer. In this book, we propose a novel image computation
algorithm called FARSIDE image to reduce the peak BDD size
inside image computation by minimizing the transition relation with 
over-approximated images as care sets. Exact and approximate reach- 
able states have been widely used to improve image computation since 
the early work of Ranjan et al. [RAB+95] and Moon et al. [MJH+98].
However, BDD minimization was effective only when applied to
the near side, or present-state variables of the transition relation. The 
FARSIDE image algorithm is the first to achieve a significant perfor- 
mance gain by applying BDD minimization to the far side, or next-state 
variables of the transition relation. It may seem surprising that signifi- 
cant improvements to the low level BDD work routines can be obtained 
long after the time when BDD methods were a consistent focus in the 
relevant conferences and journals. From our discussion and presented 
results, it should be clear that these improvements are obtained only 
when compositional methods are applied to models that are much larger 
than previously considered. 


Deciding the SAT problem of a Boolean formula is a fundamental com- 
putation in Bounded Model Checking (BMC [BCCZ99]). In BMC, we 
search for counterexamples of a finite length in the given model, and the 
existence of a finite-length counterexample is formulated into a Boolean 
formula that is satisfiable if and only if a counterexample exists. When
the Davis-Logemann-Loveland search procedure [DLL62] (implemented
in many modern SAT solvers) is used to solve the SAT problem, the vari- 
able decision ordering affects the performance significantly. In this book, 
we propose a new algorithm to compute a good variable decision order 
for the series of SAT problems in BMC. The new algorithm exploits the 
fact that the SAT problems in BMC are highly correlated, and therefore 
information learned from previous problems can help solve the current
problem. The new variable ordering is computed based on the analysis 
of the unsatisfiability proofs of previous SAT instances, and is gradually 
refined as the BMC unrolling depth keeps increasing. Shtrichman also 
studied in [Sht00] the use of static ordering to improve the SAT search 
in BMC. However, his method is based primarily on the unrolled cir- 
cuit structure, and therefore is completely orthogonal to ours. Due to 
the strong correlation among different SAT instances in BMC, applying 
our new decision ordering can significantly reduce the sizes of the SAT 
search trees and therefore improve the overall performance of BMC. 


To summarize, all the new techniques proposed in this book are fully 
automatic and are crucial to improving the performance of abstraction
refinement. Their application to model checking can significantly in- 
crease the model checker's ability to handle large designs. Our experi- 
mental studies on real-world benchmark circuits indicate that these au- 
tomatic abstraction refinement techniques are the key to applying model 
checking to industrial-scale systems. 


1.3 Organization of This Book


This book has nine chapters. Chapter 2 is an introduction to the basic 
concepts and notations commonly used in model checking, including 
finite state models, temporal logics, Büchi automata, symbolic model 
checking, bounded model checking, and abstraction refinement. This 
chapter should be easy reading for those who are familiar with model
checking. We have also tried to make the material easily accessible to
readers who are in the general area of computer science but not very 
familiar with model checking. From Chapter 3 to Chapter 8, we present 
our main research contributions in detail.


In Chapter 3, we introduce the notion of abstraction granularity and 
present the FINE-GRAIN abstraction approach. We use the simulation 
relation between the abstract and concrete models to explain why model 
checking of the abstract system may be conservative. We present the 
data structure of the SORs to capture all the shortest abstract counterex- 
amples. We show how to use SAT based multi-thread concretization test 
to decide whether the abstract counterexamples are real or not.

In Chapter 4, we present the GRAB refinement algorithm for selecting
refinement variables based on a two-player reachability game in the
abstract model. At each refinement iteration, we show how to decide 
the appropriate refinement direction using a SAT check. In both refine- 
ment directions, a greedy generational minimization is used at the end 
to remove redundant refinement variables. Finally, we discuss the use of 
sequential don't cares to constrain the behavior of the abstract model. 

In Chapters 5 and 6, we address the important problem of carrying over
information from previous abstraction levels to the current level, and applying
it to speed up model checking. We present a compositional SCC
analysis algorithm called DNC for model checking LTL properties. In- 
formation learned from previous abstraction levels is used to restrict the 
search for fair cycles at the current abstraction level. We will explain the 
use of SCC strength reduction, disjunctive state space decomposition, 
and guided search for fair cycles in the general framework of abstraction 
refinement. 

In Chapters 7 and 8, we apply the idea of abstraction and succes- 
sive refinements to the basic symbolic computation algorithms in model 
checking. In Chapter 7, we focus on improving the performance of BDD 
based symbolic image computation and present the FARSIDE image com- 
putation algorithm. In Chapter 8, we discuss the variable decision or- 
dering of a SAT solver based on the DLL procedure in the context of 
bounded model checking. We then present a new variable ordering algo- 
rithm to improve the performance of the SAT checks in BMC. In both 
chapters, we conduct experiments to demonstrate the effectiveness of 
the proposed techniques. 

We conclude in Chapter 9 and point out some interesting research 
directions. 


Chapter 2 


SYMBOLIC MODEL CHECKING 


Model checking [CE81, QS81] is an algorithmic method for proving 
that a digital system satisfies a user-defined specification. Both the sys- 
tem and the specification must be formally specified: The model of the 
system must have a finite number of states; the specification, or property, 
is often expressed in temporal logics. In the model checking literature, 
the model and the property are often represented by the Kripke structure 
and a temporal logic formula, respectively. 

Given a model $K$ and a property $\phi$, model checking is used to check
whether $K$ models $\phi$, denoted by $K \models \phi$. For properties specified
in Computation Tree Logic (CTL [CE81, EH83]), the model checking
problem can be solved by a set of least and/or greatest fixpoint
computations [CES86]. For properties specified in Linear Time Logic 
(LTL [Pnu77]), model checking is often transformed into language emptiness
checking in a generalized Büchi automaton. In this automata-theoretic
approach [VW86], the negation of the given LTL formula is encoded
into a Büchi automaton, which is then composed with the model.
The LTL model checking problem is then decided by checking the lan- 
guage of the composed system—the model satisfies the property if and 
only if the language of the composed system is empty. Therefore, the 
underlying LTL model checking algorithms are usually variants of algo- 
rithms for computing Strongly-Connected Components (SCCs). 
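
The connection between language emptiness and SCCs can be sketched for the simplest case: an explicit graph with a single Büchi acceptance set. The language is nonempty exactly when some reachable SCC contains a cycle and an accepting state. The code below uses Tarjan's algorithm on an explicit graph; it is only an illustration, not one of the symbolic SCC algorithms discussed in this book, and the function names and example graph are hypothetical. A generalized Büchi automaton would instead require the SCC to intersect every acceptance set.

```python
def reachable_sccs(graph, roots):
    """Tarjan's algorithm over the states reachable from `roots` (recursive, small graphs only)."""
    index, low, stack, on_stack, result = {}, {}, [], set(), []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.add(w)
                if w == v:
                    break
            result.append(comp)

    for r in roots:
        if r not in index:
            visit(r)
    return result

def language_is_empty(graph, init, accepting):
    """True if no reachable SCC with a cycle contains an accepting state."""
    for comp in reachable_sccs(graph, init):
        has_cycle = len(comp) > 1 or any(v in graph.get(v, ()) for v in comp)
        if has_cycle and comp & accepting:
            return False
    return True

# Hypothetical composed system: 0 -> 1 <-> 2, plus an unreachable self-loop on 3.
graph = {0: [1], 1: [2], 2: [1], 3: [3]}
property_fails = not language_is_empty(graph, init={0}, accepting={2})
```

Here an accepting state on the reachable cycle between states 1 and 2 makes the language nonempty, which in the automata-theoretic setting corresponds to a violation of the property.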

In this chapter, we first introduce the basic concepts and notations
commonly used in model checking. We then review some of the
fundamental algorithms in symbolic model checking, which include
BDD-based symbolic fixpoint computation, SCC hull and SCC enumeration
algorithms, SAT and bounded model checking, and iterative abstraction
refinement.
2.1 Finite State Model


In model checking, we deal with a formal model of the given digi- 
tal system, known as the Kripke structure. A Kripke structure is an 
annotated finite-state transition graph. 


DEFINITION 2.1 A Kripke structure is a 5-tuple

$$K = \langle S, S_0, T, A, \Lambda \rangle ,$$

where $S$ is a finite set of states, $S_0 \subseteq S$ is the set of initial states,
$T \subseteq S \times S$ is the transition relation, $A$ is a finite alphabet for which a
set $P$ of atomic propositions is given and $A = 2^P$, and $\Lambda : S \to A$ is the
labeling function.


We further require that the transition relation of a Kripke structure 
be complete; that is, every state has at least one successor. With this 
assumption, we can extend any finite state path in the state transition 
graph into an infinite one. 

As the standard representation of models in the model checking litera- 
ture, the Kripke structure has its origin in modal logic, the generalization 
of temporal logic. In modal logic, a certain formula is interpreted with 
respect to a state inside a universe, a domain of discourse, and a rela- 
tion establishing how the validity of a predicate changes from state to 
state. Temporal logic is a special case of modal logic that allows us to 
reason about how predicates evolve over time. In temporal logic model 
checking, a node or state of the Kripke structure represents the “state” 
of the given system at a certain time, and the change from state to state 
represents a time change. 

From an engineer's point of view, the Kripke structure is nothing but 
a labeled finite state machine (FSM). The additional features, i.e., the 
finite alphabet and a labeling function from states to sets of atomic 
propositions, make it possible to specify simple propositional properties 
on the finite state machine. These propositional properties, combined 
with some temporal operators, allow us to specify properties like "$\neg \mathit{abort}$
holds on all the states reachable from the initial states" or "from a
state labeled req we will eventually reach a state labeled ack." We will 
introduce temporal logic operators in the next section. Now let us focus 
on propositional properties and take a look at the example FSM at the 
right-hand side of Figure 2.1. 

The FSM in Figure 2.1 has four states, among which three are reachable
from the single initial state a. Propositions p and q belong to the
finite alphabet. With the labeling function and initial predicate
indicated in Figure 2.1, the finite state machine is augmented into a Kripke
structure defined as follows:

$$
\begin{array}{ll}
S = \{a, b, c, d\} & \Lambda(a) = \{p\} \\
S_0 = \{a\} & \Lambda(b) = \{\,\} \\
T = \{(a,a), (a,b), (b,c), (c,c), (d,d), (d,a)\} & \Lambda(c) = \{p\} \\
P = \{p, q\} & \Lambda(d) = \{q\}
\end{array}
$$

Figure 2.1. An example of the Kripke structure.
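
The structure above is small enough to write down explicitly. The following is a minimal sketch, not taken from the original text: the class name `Kripke` and its method names are illustrative assumptions, and the labels of b and d are taken to be the empty set and {q}, respectively, as read from the listing above. It checks the completeness requirement on the transition relation and confirms that three of the four states are reachable from the initial state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Kripke:
    """Explicit Kripke structure (illustrative, hypothetical data type)."""
    states: frozenset
    init: frozenset
    trans: frozenset   # set of (source, target) pairs
    labels: dict       # state -> set of atomic propositions

    def successors(self, s):
        return {t for (u, t) in self.trans if u == s}

    def is_complete(self):
        # Every state must have at least one successor.
        return all(self.successors(s) for s in self.states)

    def reachable(self):
        # Plain graph search from the initial states.
        seen, frontier = set(self.init), list(self.init)
        while frontier:
            s = frontier.pop()
            for t in self.successors(s) - seen:
                seen.add(t)
                frontier.append(t)
        return seen


K = Kripke(
    states=frozenset("abcd"),
    init=frozenset("a"),
    trans=frozenset({("a", "a"), ("a", "b"), ("b", "c"),
                     ("c", "c"), ("d", "d"), ("d", "a")}),
    labels={"a": {"p"}, "b": set(), "c": {"p"}, "d": {"q"}},
)

print(K.is_complete())        # True
print(sorted(K.reachable()))  # ['a', 'b', 'c']
```

State d has outgoing edges but no incoming edge from the other states, which is why it is unreachable from a.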


Given a sequential circuit, the construction of the finite state machine 
from the system description is straightforward. A digital circuit is often 
defined as an entity with memory elements (latches and flip-flops), com- 
binational logic gates, input signals, and internal wires. The transition 
functions of the memory elements are defined in terms of the current 
values of these memory elements and the input signals. 

Figure 2.2 gives an example circuit, in which we use the variables $x_1$
and $x_0$ to represent the outputs of the two registers, and variables $y_1$ and
$y_0$ to represent their data inputs. Note that after a clock cycle, the values
of $y_1$ and $y_0$ will be propagated to the register outputs. Therefore, we
often call $x_1$ and $x_0$ the present-state (or current-state) variables, and
call $y_1$ and $y_0$ the next-state variables. In this example, we use the
variable $w_0$ to represent the value of a primary input signal.

States in the corresponding FSM are mapped to the different
valuations of the set of memory elements. Edges in the state transition graph
correspond to the changes of states among different clock cycles. For the
example in Figure 2.2, we can write out the transition functions of the
two registers as

Figure 2.2. A sequential circuit example.


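$$y_1 :\ (x_1 \wedge \neg x_0) \vee (\neg x_1 \wedge x_0) \vee (x_1 \wedge x_0 \wedge \neg w_0)$$
$$y_0 :\ (\neg x_1 \wedge \neg x_0 \wedge w_0) \vee (x_1 \wedge x_0 \wedge \neg w_0)$$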


Given the values of present-state variables and the input signal, the
values of next-state variables are determined by their transition functions.
When the current values of the two registers are $(x_1 = 0, x_0 = 0)$, for
instance, their values at the next clock cycle will be $(y_1 = 0, y_0 = 0)$
for $w_0 = 0$, and $(y_1 = 0, y_0 = 1)$ for $w_0 = 1$. If we use the state
encoding scheme and labeling functions described on the left-hand side of
Figure 2.1, we will get the right-hand side Kripke structure in the same
figure.
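
The construction of the transition relation from the next-state functions can be sketched by enumerating all valuations of the present-state and input variables. The sketch below rests on two assumptions that are not confirmed by the text: the state encoding a = 00, b = 01, c = 10, d = 11 (the encoding itself appears only in the figure), and a reading of the two transition functions in which the last product term of each contains the negated input. Under these assumptions the enumeration reproduces the transition relation $T$ listed for Figure 2.1.

```python
from itertools import product


def next_state(x1, x0, w0):
    """Assumed reading of the transition functions of the two registers."""
    y1 = (x1 and not x0) or (not x1 and x0) or (x1 and x0 and not w0)
    y0 = (not x1 and not x0 and w0) or (x1 and x0 and not w0)
    return int(bool(y1)), int(bool(y0))


# Assumed state encoding (x1, x0) -> state name.
NAME = {(0, 0): "a", (0, 1): "b", (1, 0): "c", (1, 1): "d"}


def transition_relation():
    """Collect one edge per valuation of (x1, x0, w0); the input is abstracted away."""
    edges = set()
    for x1, x0, w0 in product((0, 1), repeat=3):
        edges.add((NAME[(x1, x0)], NAME[next_state(x1, x0, w0)]))
    return edges


print(sorted(transition_relation()))
```

Because the input $w_0$ is quantified away, two input values that lead to the same next state contribute a single edge.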

Since the number of memory elements in a sequential circuit is finite,
there are only a finite number of states. However, there is a well-known
state explosion problem: the total number of states in the FSM can
be as large as $2^n$ for a system with $n$ binary state variables. Due to its
exponential dependence on the number of state variables, the number
of states of the model can be extremely large even for a moderate-size
system.

Some digital systems may have an infinite number of states. Software
with recursive function calls and unbounded data structures, for
instance, falls into this category. Other examples include timed systems
and hybrid systems [ACH+95, AH96], in which the state variables can
be of unbounded integer or even real type. Since model checking
requires the Kripke structure to be finite-state, a certain degree of
abstraction is needed to extract suitable verification models from these
systems before we can apply model checking. In general, abstraction used for
this purpose is either under-approximation or over-approximation. The
process of mapping an infinite state space into a finite state space, by
itself, is an important research topic, and is beyond the scope of this
book. In the sequel, we assume that the finite state model of a given
system, or the Kripke structure, is already available.


2.2 Temporal Logic Property 


Propositional logic is the basis for specifying properties. A proposition
is a declarative sentence about the Kripke structure that is either true
or false. Propositions are represented by a set of propositional variables
p, q, ... plus the truth values true and false. A formula consisting of a
propositional variable is called an atomic proposition. The evaluation of
an atomic proposition maps to a set of states in the Kripke structure.

Propositional logic formulae are defined in terms of atomic proposi- 
tions with the common logical connectives. 


DEFINITION 2.2 A propositional logic formula is defined as follows:

■ atomic propositions are propositional formulae;

■ if $\phi$ is a propositional formula, then $\neg\phi$ is a propositional formula;

■ if $\phi$ and $\psi$ are propositional formulae, then $\phi \wedge \psi$, $\phi \vee \psi$, $\phi \to \psi$, and
$\phi \leftrightarrow \psi$ are propositional formulae.


In the set of logical connectives, the unary operator negation ($\neg$) and
the binary operator logical AND ($\wedge$) constitute a minimal subset that
is sufficient for defining propositional logic. Besides $\wedge$, there are 15 other
binary logical connectives; however, all of them can be expressed in terms
of $\neg$ and $\wedge$. For example, by De Morgan's law the formula $\phi \vee \psi$
can be rewritten into $\neg(\neg\phi \wedge \neg\psi)$. The "implies" operator $\to$ means
"only if", and therefore $\phi \to \psi$ is equivalent to $\neg\phi \vee \psi$. Similarly, the
formula $\phi \leftrightarrow \psi$ is equivalent to $(\neg\phi \wedge \neg\psi) \vee (\phi \wedge \psi)$.
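
These equivalences can be checked mechanically by comparing truth tables. The following is a small sketch in which the helper name `equivalent` is our own; each connective is compared against a rewriting that uses only negation and conjunction, and the count of 16 binary connectives is confirmed by enumerating all truth tables over two arguments.

```python
from itertools import product

BOOL = (False, True)


def equivalent(f, g):
    """True if f and g agree on all four valuations of two variables."""
    return all(f(a, b) == g(a, b) for a, b in product(BOOL, repeat=2))


# Each pair: (reference connective, rewriting using only 'not' and 'and').
or_ok = equivalent(lambda a, b: a or b,
                   lambda a, b: not (not a and not b))
implies_ok = equivalent(lambda a, b: (not a) or b,
                        lambda a, b: not (a and not b))
iff_ok = equivalent(lambda a, b: a == b,
                    lambda a, b: not (not (not a and not b) and not (a and b)))

# A binary connective is determined by its 4-entry truth table.
num_connectives = len(list(product(BOOL, repeat=4)))

print(or_ok, implies_ok, iff_ok, num_connectives)  # True True True 16
```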

Propositional logic is incapable of reasoning about the evolution of 
valuations over time. When the truth of a property depends on not 
only the present valuation, but also on the valuations in the past or 
in the future, we need temporal logics. The most common temporal 
logics to express system properties are Computational Tree Logic (CTL) 
and Linear Time Temporal Logic (LTL). CTL and LTL are subsets of 
the more general CTL*. In this book, we will focus on Linear Time 
Temporal Logic, but we will also briefly describe the Computational 
Tree Logic, since some of its operators will be used in our discussion of 
model checking algorithms. 
There are two very different ways of modeling time in temporal logics.
The linear time model assumes that each time instance has exactly one 
successor; the branching time model, on the other hand, allows several 
successors for each time instance. LTL is based on the linear time model. 
LTL formulae specify properties about the future of each individual ex- 
ecution trace such as the condition ack will eventually be true, or that 
the condition busy will be true until another condition done becomes 
true. Logics based on the branching time model, such as CTL, deal with 
all possible execution traces. CTL formulae can specify properties such 
as that if the condition reset is true then on all paths the condition 
reset_done will eventually be true.

LTL formulae are defined in terms of atomic propositions, the usual
logic connectives, as well as linear time temporal operators. The two
basic temporal operators in LTL are X and U, called next and until,
respectively. The first operator is unary and the second is binary. The
formula $\mathsf{X}\,\phi$ means that $\phi$ holds at the next point of time. The
formula $\phi\,\mathsf{U}\,\psi$ means that $\phi$ has to hold until $\psi$ becomes true, and $\psi$ will
eventually become true.


DEFINITION 2.3 A Linear Time Temporal Logic (LTL) formula is
defined recursively as follows:

■ atomic propositions are LTL formulae;

■ if $\phi$ and $\psi$ are LTL formulae, so are $\neg\phi$, $\phi \wedge \psi$, and $\phi \vee \psi$;

■ if $\phi$ and $\psi$ are LTL formulae, so are $\mathsf{X}\,\phi$ and $\phi\,\mathsf{U}\,\psi$.


Besides X and U, there are other temporal operators including G for
globally, F for finally, and R for release. The formula $\mathsf{G}\,\phi$ means that $\phi$
has to hold forever. The formula $\mathsf{F}\,\phi$ means that $\phi$ will eventually be
true. The formula $\phi\,\mathsf{R}\,\psi$ means that $\psi$ remains true up to and including
the first time $\phi$ becomes true (or forever if $\phi$ remains false). These three
temporal operators can be expressed in terms of the two basic ones:

$$\mathsf{F}\,\phi = \mathit{true}\,\mathsf{U}\,\phi$$
$$\mathsf{G}\,\phi = \neg\mathsf{F}\,\neg\phi$$
$$\phi\,\mathsf{R}\,\psi = \neg(\neg\phi\,\mathsf{U}\,\neg\psi)$$


The semantics of LTL formulae are defined for an infinite path $\pi =
(s_0, s_1, \ldots)$ of the Kripke structure, where $s_i \in S$ is a state, $s_0$ is an
initial state, and $T(s_i, s_{i+1})$ evaluates to true for all $i \geq 0$. The suffix
of $\pi$ starting from the state $s_i$ is represented by $\pi^i$. We use $K, \pi^i \models \phi$
to represent the fact that $\phi$ holds in a suffix of path $\pi$ of the Kripke
structure $K$. The property $\phi$ holds for the entire path $\pi$ if and only
if $K, \pi^0 \models \phi$. When the context is clear, we will omit $K$ and rewrite
$K, \pi^i \models \phi$ into $\pi^i \models \phi$. The semantics of LTL formulae are defined
recursively as follows:





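$$
\begin{array}{lcl}
\pi \models \mathit{true} & & \text{always holds} \\
\pi \models \phi & \text{iff} & \pi^0 \models \phi \\
\pi \models \neg\phi & \text{iff} & \pi \not\models \phi \\
\pi \models \phi \wedge \psi & \text{iff} & \pi \models \phi \text{ and } \pi \models \psi \\
\pi \models \mathsf{X}\,\phi & \text{iff} & \pi^1 \models \phi \\
\pi \models \phi\,\mathsf{U}\,\psi & \text{iff} & \exists i \geq 0 \text{ such that } \pi^i \models \psi \text{ and for all } 0 \leq j < i,\ \pi^j \models \phi \\
\pi \models \phi\,\mathsf{R}\,\psi & \text{iff} & \text{for all } i \geq 0,\ \pi^i \models \psi; \text{ or } \exists j \geq 0 \text{ such that } \pi^j \models \phi \text{ and for all } 0 \leq i \leq j,\ \pi^i \models \psi
\end{array}
$$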
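
Infinite paths cannot be enumerated, but an ultimately periodic ("lasso") path, consisting of a finite prefix followed by a loop that repeats forever, has a finite representation on which the semantics above can be evaluated directly. The following is a minimal sketch under our own assumptions: formulae are encoded as nested tuples, `labels[i]` is the label set of the i-th position, and `loop_start` is the position the path returns to after the last one. The until operator is computed as a least fixpoint, and F, G, and R are derived from it as in the identities given earlier.

```python
def sat(f, labels, loop_start):
    """Set of positions i such that the suffix pi^i satisfies formula f."""
    n = len(labels)
    succ = lambda i: i + 1 if i + 1 < n else loop_start
    op = f[0]
    if op == "true":
        return set(range(n))
    if op == "ap":
        return {i for i in range(n) if f[1] in labels[i]}
    if op == "not":
        return set(range(n)) - sat(f[1], labels, loop_start)
    if op == "and":
        return sat(f[1], labels, loop_start) & sat(f[2], labels, loop_start)
    if op == "X":
        inner = sat(f[1], labels, loop_start)
        return {i for i in range(n) if succ(i) in inner}
    if op == "U":
        left = sat(f[1], labels, loop_start)
        result = set(sat(f[2], labels, loop_start))
        while True:  # least fixpoint: grow the set until it is stable
            new = {i for i in left if succ(i) in result} - result
            if not new:
                return result
            result |= new
    raise ValueError(f"unknown operator {op!r}")


TRUE = ("true",)
def NOT(f): return ("not", f)
def F(f): return ("U", TRUE, f)                 # F f = true U f
def G(f): return NOT(F(NOT(f)))                 # G f = not F not f
def R(f, g): return NOT(("U", NOT(f), NOT(g)))  # f R g = not(not f U not g)


p = ("ap", "p")
labels = [{"p"}, set(), {"p"}]  # path a, b, c, c, c, ... with p in a and c
holds = lambda f: 0 in sat(f, labels, loop_start=2)

print(holds(F(G(p))), holds(G(p)))  # True False
```

The path used here visits a and b once and then stays in c, so p eventually holds forever although it does not hold globally.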


The Kripke structure $K$ satisfies an LTL formula $\phi$ if and only if all
paths from the initial states do. This means that all LTL properties are
universal properties in the sense that we can add the path quantifier A
as a prefix without changing the meaning of the properties. That is,
$K \models \phi$ is equivalent to $K \models \mathsf{A}\,\phi$, where the path quantifier A means
$\phi$ holds for all computation paths. Another path quantifier is E, which
stands for there exists a computation path. E is not used in LTL, but
both A and E are used in CTL.

An LTL formula is in the normal form if negation appears only in
front of propositional formulae. For instance, the formula $\mathsf{F}\,\neg\mathsf{F}\,p$ is not
in the normal form since negation is ahead of the temporal operator F;
on the other hand, the equivalent formula $\mathsf{F}\,\mathsf{G}\,\neg p$ is in the normal form.
We can always rewrite an LTL formula into normal form by pushing
negation inside temporal operators. The following rules can be applied
during the rewriting:

$$\mathsf{G}\,p = \neg\mathsf{F}\,\neg p$$
$$\mathsf{F}\,p = \mathit{true}\,\mathsf{U}\,p$$

Since an LTL formula $\phi$ is a universal property and is equivalent to $\mathsf{A}\,\phi$,
the negation of $\phi$ should be the existential property $\mathsf{E}\,\neg\phi$.

The two path quantifiers are an integral part of Computational Tree 
Logic (CTL), and are used explicitly to specify properties related to ex- 
ecution traces in the computation tree structure. A (for all computation 
paths) specifies that all paths starting from a given state satisfy a prop- 
erty; E (for some computation paths) specifies that some of these paths 
satisfy a property. 
DEFINITION 2.4 A Computational Tree Logic (CTL) formula is defined
recursively as follows:

■ atomic propositions are CTL formulae;

■ if $\phi$ and $\psi$ are CTL formulae, then $\neg\phi$, $\phi \wedge \psi$, and $\phi \vee \psi$ are CTL
formulae;

■ if $\phi$ and $\psi$ are CTL formulae, then $\mathsf{EX}\,\phi$, $\mathsf{E}\,\phi\,\mathsf{U}\,\psi$, and $\mathsf{EG}\,\phi$ are CTL
formulae.


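A CTL formula is in the normal form if negation appears only in
front of propositional formulae. Formula $\neg\mathsf{AX}\,p$ is not in the normal
form since negation is ahead of the temporal operator AX; on the other
hand, the equivalent formula $\mathsf{EX}\,\neg p$ is in the normal form. We can
always rewrite a CTL formula into normal form by pushing negation
inside temporal operators. The following rewriting rules can be applied
during normalization:

$$\mathsf{AX}\,p = \neg\mathsf{EX}\,\neg p$$
$$\mathsf{AG}\,p = \neg\mathsf{EF}\,\neg p$$
$$\mathsf{A}\,p\,\mathsf{U}\,q = \neg(\mathsf{E}\,\neg q\,\mathsf{U}\,(\neg p \wedge \neg q)) \wedge \neg\mathsf{EG}\,\neg q$$
$$\mathsf{AF}\,p = \mathsf{A}\,\mathit{true}\,\mathsf{U}\,p$$
$$\mathsf{EF}\,p = \mathsf{E}\,\mathit{true}\,\mathsf{U}\,p$$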


Many interesting properties in practice can be expressed in both LTL 
and CTL. However, there are also properties that can be expressed in 
one but not the other. The difference between an LTL formula and a 
CTL formula can be very subtle. For instance, the LTL formula F Gp 
holds in the Kripke structure in Figure 2.3, but the CTL formula AF AG p 
fails. (In the Kripke structure, p and q are state labels.) 

The reason is that the LTL property is related to the individual paths,
and on any infinite path of the given Kripke structure we can reach the
state c from which p will hold forever. The CTL formula AF AG p, on
the other hand, requires that on all paths from the state a we can reach
a state satisfying AG p. Note that the only state satisfying AG p is the
state c; however, the Kripke structure does not satisfy AF {c}: as shown
in the right-hand side of the figure, the leftmost path of the computation
tree is a counterexample. On this particular path, we can stay in the
state a while reserving the possibility of going to the state b (where p
does not hold). Therefore, F G p and AF AG p represent two very similar
but different properties.
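
The CTL side of this argument can be replayed with explicit fixpoint computations. The sketch below assumes, based only on the description in the text, that the structure of Figure 2.3 has states a, b, c with edges a to a, a to b, b to c, and c to c, and that p holds in a and c but not in b; the function names are our own. It computes the set of states satisfying AG p and then shows that the initial state a does not satisfy AF AG p.

```python
STATES = {"a", "b", "c"}
T = {("a", "a"), ("a", "b"), ("b", "c"), ("c", "c")}  # assumed edges
P = {"a", "c"}                                        # states labeled p


def EX(Z):
    """States with at least one successor in Z (pre-image)."""
    return {s for (s, t) in T if t in Z}


def EU(Y, Z):
    """E[Y U Z] as a least fixpoint."""
    res = set(Z)
    while True:
        new = (Y & EX(res)) - res
        if not new:
            return res
        res |= new


def EG(Y):
    """EG Y as a greatest fixpoint."""
    res = set(Y)
    while True:
        nxt = res & EX(res)
        if nxt == res:
            return res
        res = nxt


def AG(Y):
    return STATES - EU(STATES, STATES - Y)  # AG Y = not EF not Y


def AF(Y):
    return STATES - EG(STATES - Y)          # AF Y = not EG not Y


ag_p = AG(P)
af_ag_p = AF(ag_p)
print(sorted(ag_p), sorted(af_ag_p))  # ['c'] ['b', 'c']
```

The state a is missing from the second set because the self-loop on a yields a path that never reaches c, which is the counterexample described above.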

The above example shows that LTL and CTL have different expressive
powers. Some LTL properties, like F G p, cannot be expressed in
CTL. There are also CTL properties that cannot be expressed in LTL;
an example in this category would be AG EF p. Both LTL and CTL are
strict subsets of the more general CTL* logic [EH83, EL87]. The
relationship among LTL, CTL, and CTL* is given in Figure 2.4. In this
book, we focus primarily on LTL model checking. Readers who are
interested in CTL model checking are referred to [CES86, McM94] or the
book [CGP99].

Figure 2.3. A Kripke structure and its computation tree.


Figure 2.4. The relationship among LTL, CTL, and CTL*.





We have used the term universal property during previous discussions. 
Now we give a formal definition of universal and existential properties. 


DEFINITION 2.5 A property $\phi$ is a universal property if removing edges
from the state transition graph of the Kripke structure does not reduce
the set of states satisfying $\phi$. A property $\psi$ is an existential property if
adding edges into the state transition graph of the Kripke structure does
not reduce the set of states satisfying $\psi$.
It follows that all LTL properties and ACTL (the universal fragment
of CTL) properties are universal. The existential fragment of CTL, or
ECTL, is existential. For the propositional $\mu$-calculus formulae, those
that do not use EX and EY in their normal forms are universal.

Temporal logic properties can also be classified into the following two 
categories: safety properties and liveness properties. The notion of 
safety and liveness was introduced first by Lamport [Lam77]. Alpern 
and Schneider [AS85] later gave a formal definition of both safety and 
liveness properties. Informally, a safety property states that something 
bad will not happen during a system execution. Liveness properties 
are dual to safety properties, expressing that eventually something good 
must happen. The distinction of safety and liveness properties was orig- 
inally motivated by the different techniques for proving them. 

We can think of a property as a set of execution sequences, each of 
which is an infinite sequence of states of the Kripke structure. A prop- 
erty is called a safety property if and only if each execution violating 
the property has a finite prefix violating that property. In other words, 
a finite prefix of an execution violating the property (bad thing) is ir- 
remediable no matter how the prefix is extended to an infinite path. 
Safety properties can be falsified in a finite initial part of the execution, 
although proving them requires the traversal of the entire set of reach- 
able states. The invariant property Gp or AGp, which states that the 
propositional formula p always holds, is a safety property. Other safety 
properties include mutual exclusion, deadlock freedom, etc. 
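
The asymmetry between refuting and proving an invariant can be seen in a simple explicit-state search. The sketch below is our own illustration, not the algorithm used later in this book: the function name `check_invariant` is hypothetical, and the example reuses the transition relation of Figure 2.1. A violation is reported as a finite prefix, whereas a proof requires the search to exhaust the reachable states.

```python
from collections import deque


def check_invariant(init, trans, good):
    """Return None if all reachable states are in `good`,
    otherwise a shortest path from an initial state to a bad state."""
    parent = {s: None for s in init}
    queue = deque(init)
    while queue:
        s = queue.popleft()
        if s not in good:
            # Rebuild the finite prefix that witnesses the violation.
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for (u, t) in trans:
            if u == s and t not in parent:
                parent[t] = s
                queue.append(t)
    return None


T = {("a", "a"), ("a", "b"), ("b", "c"), ("c", "c"), ("d", "d"), ("d", "a")}

print(check_invariant({"a"}, T, good={"a", "c"}))       # ['a', 'b']
print(check_invariant({"a"}, T, good={"a", "b", "c"}))  # None
```

In the first call the invariant p fails because b is reachable and is not labeled p; in the second call the search visits every reachable state before concluding that the invariant holds.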

A property is a liveness property if and only if it contains at least 
one good continuation for every finite prefix. This corresponds to the 
intuition that it is still possible for the property to hold (good thing to 
happen) after any finite execution. Liveness properties do not have a 
finite counterexample, and therefore in principle cannot be falsified after 
a finite number of execution steps. An example of liveness property is 
$\mathsf{G}(p \to \mathsf{F}\,q)$, which states that whenever the propositional formula p is
true, the propositional formula q must become true at some future cycle
although there is no upper limit on the time by which q is required to 
become true. Other liveness properties include accessibility, absence of 
starvation, etc. 


2.3 Generalized Büchi Automaton


An LTL formula $\phi$ always corresponds to a Büchi automaton $\mathcal{A}_\phi$ that
recognizes all its satisfying infinite paths. In other words, the Büchi
automaton $\mathcal{A}_\phi$ contains all the logic models of the formula $\phi$. If we
consider the Kripke structure as a language generator and the Büchi
automaton $\mathcal{A}_\phi$ as a language recognizer, then we have $K \models \phi$ if and only
if all infinite words generated by $K$ are accepted by $\mathcal{A}_\phi$. Therefore, the
LTL model checking problem can be translated to $\omega$-regular language
containment checking. Since checking language containment between
two Büchi automata in general is PSPACE-complete [Kur94], it follows
that LTL model checking is PSPACE-complete.

In practice, however, LTL model checking is often translated into
language emptiness checking in a generalized Büchi automaton. This
automata-theoretic approach [VW86] consists of the following three steps:


1 we negate the given property $\phi$ and translate it into a Büchi
automaton $\mathcal{A}_{\neg\phi}$, which accepts all the infinite paths that do not satisfy
$\phi$;


2 we then compose the model $K$ and the property automaton $\mathcal{A}_{\neg\phi}$
together. The system produced by parallel composition, denoted by
$(K \| \mathcal{A}_{\neg\phi})$, consists of only those infinite paths of $K$ that are accepted
by $\mathcal{A}_{\neg\phi}$;


3 finally, we check whether the language of the composed system is
empty.


If the language is empty, then $K \models \phi$ since no infinite path in $K$ is
accepted by $\mathcal{A}_{\neg\phi}$. If the language is not empty, any accepting run in the
composed system serves as a counterexample to $K \models \phi$.

LTL model checking via language emptiness has the same worst-case 
complexity bound as the language containment based approach, which 
is linear in the number of states of the model, but exponential in the 
length of the LTL formula. The exponential blow-up comes from the 
translation from LTL formulae to Büchi automata. However, this is 
often acceptable in practice, because user specified LTL formulae are 
usually small compared to the size of the model. 

In the automata-theoretic approach, we can use the labeled
generalized Büchi automata as a unified representation for the model $K$, the
property automaton $\mathcal{A}_{\neg\phi}$, as well as the composed system $(K \| \mathcal{A}_{\neg\phi})$. A
labeled generalized Büchi automaton is simply a Kripke structure
augmented by a set of acceptance conditions. In other words, we can view
the model $K$ as a special case of the labeled generalized Büchi automaton
whose only acceptance condition is satisfied by all paths.


DEFINITION 2.6 A labeled generalized Büchi automaton is a six-tuple

$$\mathcal{A} = \langle S, S_0, T, A, \Lambda, \mathcal{F} \rangle ,$$

where $S$ is the finite set of states, $S_0 \subseteq S$ is the set of initial states,
$T \subseteq S \times S$ is the transition relation, $A$ is a finite alphabet for which a
set $P$ of atomic propositions is given and $A = 2^P$, $\Lambda : S \to A$ is the
labeling function, and $\mathcal{F} \subseteq 2^S$ is the set of acceptance conditions.


A run of $\mathcal{A}$ is an infinite sequence $\rho = s_0, s_1, \ldots$ over $S$, such that
$s_0 \in S_0$ and for all $i \geq 0$, $(s_i, s_{i+1}) \in T$. A run $\rho$ is accepting or fair
if for each fair set $F_i \in \mathcal{F}$, there exists $s_j \in F_i$ that appears infinitely
often in $\rho$. The automaton accepts an infinite word $\sigma = \sigma_0, \sigma_1, \ldots$ in
$A^\omega$ if there exists an accepting run $\rho$ such that for all $i \geq 0$, $\sigma_i \in \Lambda(\rho_i)$.
The language of $\mathcal{A}$, denoted by $\mathcal{L}(\mathcal{A})$, is the subset of $A^\omega$ accepted by
$\mathcal{A}$. Note that the language of $\mathcal{A}$ is nonempty if and only if $\mathcal{A}$ contains a
reachable fair cycle, that is, a cycle that is reachable from an initial state and
intersects with all the fair sets.

We have defined the automata with labels on the states, not on the
edges. The automata are called generalized Büchi automata because
multiple acceptance conditions are possible. A state $s$ is complete if for
every $a \in A$, there is a successor $s'$ of $s$ such that $a \in \Lambda(s')$. A set of
states, or an automaton, is complete if all of its states are. In a complete
automaton, any finite state path can be extended into an infinite run.
In the sequel all automata are assumed to be complete.

We define the concrete system $\mathcal{A}$ as the synchronous (or parallel)
composition of a set of submodules. Composing a subset of these
submodules gives us an over-approximated abstract model $\mathcal{A}'$. In symbolic
algorithms, $\mathcal{A}$ and $\mathcal{A}'$, as well as the submodules, are all defined over the
same state space and agree on the state labels. Communication among
submodules then proceeds through the common state space, and
composition is characterized by the intersection of the transition relations.


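DEFINITION 2.7 The composition $\mathcal{A}_1 \| \mathcal{A}_2$ of two Büchi automata $\mathcal{A}_1$
and $\mathcal{A}_2$, where

$$\mathcal{A}_1 = \langle S, S_{01}, T_1, A, \Lambda, \mathcal{F}_1 \rangle ,$$
$$\mathcal{A}_2 = \langle S, S_{02}, T_2, A, \Lambda, \mathcal{F}_2 \rangle ,$$

is a Büchi automaton $\mathcal{A} = \langle S, S_0, T, A, \Lambda, \mathcal{F} \rangle$ such that $S_0 = S_{01} \cap S_{02}$,
$T = T_1 \cap T_2$, and $\mathcal{F} = \mathcal{F}_1 \cup \mathcal{F}_2$.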
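
With an explicit set representation, this definition amounts to a few set operations. The following is a minimal sketch; the class name `Buchi`, the function name `compose`, and the two small example modules are hypothetical, and the shared alphabet and labeling function are omitted because both operands agree on them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Buchi:
    """Explicit generalized Buchi automaton without labels (illustrative)."""
    states: frozenset
    init: frozenset
    trans: frozenset  # set of (source, target) pairs
    fair: frozenset   # set of fair sets, each a frozenset of states


def compose(a1, a2):
    """Composition over a common state space, following Definition 2.7."""
    assert a1.states == a2.states, "both automata must share the state space"
    return Buchi(
        states=a1.states,
        init=a1.init & a2.init,     # intersection of initial states
        trans=a1.trans & a2.trans,  # intersection of transition relations
        fair=a1.fair | a2.fair,     # union of acceptance conditions
    )


S = frozenset({0, 1, 2})
m1 = Buchi(S, frozenset({0, 1}), frozenset({(0, 1), (1, 2), (2, 0), (0, 0)}),
           frozenset({frozenset({2})}))
m2 = Buchi(S, frozenset({0}), frozenset({(0, 1), (1, 2), (2, 2), (0, 0)}),
           frozenset({frozenset({0, 1})}))
m = compose(m1, m2)
print(sorted(m.init), sorted(m.trans), len(m.fair))
```

Each submodule constrains the transitions it cares about and leaves the rest unconstrained, so leaving a submodule out of the composition can only add transitions; this is why composing a subset of the submodules yields an over-approximation.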


In Figure 2.5, we give an example to show how the automata-theoretic
approach for LTL model checking works. The model or Kripke structure
in this example corresponds to the circuit in Figure 2.2. We are
interested in checking the LTL property $\phi = \mathsf{F}\,\mathsf{G}\,p$; that is, eventually we will
reach a point from which the propositional formula p holds forever.

First, we create the property automaton $\mathcal{A}_{\neg\phi}$ that admits all runs
satisfying the negation of $\phi$, which is $\neg\phi$ or $\mathsf{G}\,\mathsf{F}\,\neg p$. Runs satisfying $\mathsf{G}\,\mathsf{F}\,\neg p$
must visit states labeled $\neg p$ infinitely often. There are algorithms to
translate a general LTL formula to a Büchi automaton; readers who are
interested in this subject are referred to [VW86, GPVW95, SB00]. Since
our example is simple enough, we do not need to go through the detailed
translation algorithm to convince ourselves that the automaton in
Figure 2.5 indeed corresponds to $\mathsf{G}\,\mathsf{F}\,\neg p$. In Figure 2.5, states satisfying
the acceptance condition are represented as double circles. The
property automaton has only one acceptance condition $\{0\}$, meaning that
any accepting run has to go through the state 0 infinitely often. We
assume that all states in the Kripke structure $K$ are accepting; that is,
the acceptance condition of $K$ is $\{a, b, c, d\}$.

Figure 2.5. An LTL model checking example.

After composing the property automaton with the model, we have
the composed system at the bottom of Figure 2.5, whose acceptance
condition is $\{a0, b0, c0\}$. Note that in the creation of $K \| \mathcal{A}_{\neg\phi}$ we have
used the parallel composition; that is, only transitions that are allowed
by both parents are retained. The final acceptance condition is also the
union of those of both parents. Since the one for $K$ consists of the entire
state space, it is omitted. Finally, we check language emptiness in the
composed system by searching for a run that goes through some states
in the fair set $\{a0, b0, c0\}$ infinitely often. It is clear that the language of
that system is empty, because no run visits any of these states infinitely
often. Therefore, the property $\phi$ holds in $K$.

Whether the language of a Büchi automaton is empty can be decided
by evaluating the temporal logic property $\mathsf{EG}_{\mathit{fair}}\,\mathit{true}$ on the automaton.
In other words, the language of the composed system is empty if and
only if no initial state of the composed system satisfies $\mathsf{EG}_{\mathit{fair}}\,\mathit{true}$. The
property is an existential CTL formula augmented with a set of Büchi
fairness constraints; for our running example, $\mathit{fair} = \{\{a0, b0, c0\}\}$. In
a run satisfying this property, a state in every $F_i \in \mathcal{F}$ must be visited
infinitely often. The CTL formula under fairness constraints can be
decided by a set of nested fixpoint computations:


$$\mathsf{EG}_{\mathit{fair}}\,\mathit{true} = \nu Z.\ \mathsf{EX} \bigwedge_{F_i \in \mathcal{F}} \mathsf{E}\,[Z\ \mathsf{U}\ (Z \wedge F_i)] ,$$

where $\nu$ denotes the outer greatest fixpoint computation, and EU
represents the embedded least fixpoint computations. When a monotonically
increasing transform function $f$ is applied repeatedly to a set $Z$, we
define $f^{(i)}(Z) = f(f(\ldots f(Z)))$ and declare $f^{(i)}(Z)$ a least fixpoint if
$f^{(i)}(Z) = f^{(i+1)}(Z)$. Conversely, when a monotonically decreasing transform
function $g$ is applied repeatedly to a set $Z$, we define $g^{(i)}(Z) = g(g(\ldots g(Z)))$
and declare $g^{(i)}(Z)$ a greatest fixpoint if $g^{(i)}(Z) = g^{(i+1)}(Z)$. When we
evaluate the above formula through fixpoint computation, the initial value
of the auxiliary iteration variable $Z$ can be set to the entire universe.
For our running example,


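$$
\begin{array}{rcl}
F_0 & = & \{a0, b0, c0\} \\
Z^0 & = & \{a0, b0, c0, a1, b1, c1, a2, b2, c2, a3, b3, c3\} \\[4pt]
Z^1 & = & \mathsf{EX}\,\mathsf{E}\,[Z^0\ \mathsf{U}\ (Z^0 \wedge F_0)] \\
    & = & \mathsf{EX}\,\mathsf{E}\,[\{a0, b0, c0, a1, b1, c1, a2, b2, c2, a3, b3, c3\}\ \mathsf{U}\ \{a0, b0, c0\}] \\
    & = & \mathsf{EX}\,\{a1, b0\} \\
    & = & \{a1\} \\[4pt]
Z^2 & = & \mathsf{EX}\,\mathsf{E}\,[Z^1\ \mathsf{U}\ (Z^1 \wedge F_0)] \\
    & = & \mathsf{EX}\,\mathsf{E}\,[\{a1\}\ \mathsf{U}\ \{\,\}] \\
    & = & \mathsf{EX}\,\{\,\} \\
    & = & \{\,\}
\end{array}
$$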
Since no state in the composed system satisfies $\mathsf{EG}_{\mathit{fair}}\,\mathit{true}$, the language
is empty. This method for deciding $\mathsf{EG}_{\mathit{fair}}\,\mathit{true}$ is known as the Emerson
and Lei algorithm [EL86], which is the representative of a class of SCC
hull algorithms [SRB02]. In general, the evaluation of EX and EU
operators does not have to take the above alternating order; they can be
computed in arbitrary orders without affecting the convergence of the
fixpoint [FFK+01, SRB02]. All SCC hull algorithms share the same
worst-case complexity bound: they require $O(\eta^2)$ symbolic steps, where $\eta$ is
the number of states of the composed model. A symbolic step is either a
pre-image computation (finding predecessors through the evaluation of
EX) or an image computation (finding successors through the evaluation
of EY, the dual of EX).
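
The nested fixpoint can be replayed on explicit sets. The sketch below is an illustration under stated assumptions rather than a symbolic implementation: sets of states are ordinary Python sets instead of BDDs, the function name `emerson_lei` is our own, and the graph contains only the fragment of the composed system that the computation above implies (a1 with a self-loop, a1 to b0, b0 to c1, and c1 with a self-loop), since the full graph appears only in Figure 2.5.

```python
def emerson_lei(states, trans, fair_sets):
    """Greatest fixpoint nu Z. EX AND_i E[Z U (Z and F_i)] on explicit sets."""

    def EX(Z):
        return {s for (s, t) in trans if t in Z}

    def EU(Y, Z):
        res = set(Z)
        while True:  # embedded least fixpoint
            new = (Y & EX(res)) - res
            if not new:
                return res
            res |= new

    Z = set(states)
    trace = [set(Z)]
    while True:      # outer greatest fixpoint
        inner = set(Z)
        for F in fair_sets:
            inner &= EU(Z, Z & F)
        new_Z = EX(inner)
        if new_Z == Z:
            return Z, trace
        Z = new_Z
        trace.append(set(Z))


states = {"a1", "b0", "c1"}                                      # assumed fragment
trans = {("a1", "a1"), ("a1", "b0"), ("b0", "c1"), ("c1", "c1")}
fair = [{"a0", "b0", "c0"}]

hull, trace = emerson_lei(states, trans, fair)
print(hull, [sorted(z) for z in trace])
```

On this fragment the iterates are the full set, then {a1}, then the empty set, which agrees with the hand computation. Adding a hypothetical edge from c1 back to b0 would create a fair cycle, and the fixpoint would then contain all three states.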

Another way of checking language emptiness is to find all the strongly
connected components (SCCs) and then check whether any of them
satisfies all the acceptance conditions. If there exists a reachable non-trivial
SCC that intersects every $F_i \in \mathcal{F}$, the language of the Büchi automaton
is not empty. An SCC consisting of just one state without a self-loop is
called trivial. In our running example, the reachable non-trivial SCCs of
the composed system are $\{a1\}$ and $\{c1\}$. Since none of the non-trivial
SCCs intersects the fair set $\{a0, b0, c0\}$, the language of the system is
empty.

An SCC is a maximal set of states such that there is a path between
any two states. A reachable non-trivial SCC that intersects all
acceptance conditions is called a fair SCC. An SCC that contains some initial
states is called an initial SCC. An SCC-closed set of $\mathcal{A}$ is the union
of a collection of SCCs. The complete set of SCCs of $\mathcal{A}$, denoted by
$\Pi(\mathcal{A})$, forms a partition of the states of $\mathcal{A}$. Likewise, the set of disjoint
SCC-closed sets can also form a partition of the state space $S$. An SCC
partition $\Pi_1$ of $S$ is a refinement of another partition $\Pi_0$ of $S$ if for every
SCC or SCC-closed set $C_1 \in \Pi_1$ there exists $C_0 \in \Pi_0$ such that $C_1 \subseteq C_0$.


An SCC (quotient) graph is constructed from a graph by contracting
each SCC into a node, merging parallel edges, and removing
self-loops. The SCC graph of $\mathcal{A}$, denoted by $Q(\mathcal{A})$, is a directed acyclic
graph (DAG); it induces a partial order: the minimal (maximal) SCC
has no incoming (outgoing) edge. Reachable fair SCCs, by definition,
contain accepting runs that make the language non-empty. Therefore,
a straightforward way of checking language emptiness is to compute all
the reachable SCCs, and then check whether any of them is a fair SCC.


OBSERVATION 2.8 The language of a Büchi automaton is empty if and 
only if it does not have any reachable fair SCC. 
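
This observation translates into a direct emptiness check. The sketch below is our own illustration: it computes each SCC as the intersection of the forward and backward reachable sets of a pivot state, which is simple but not the linear-time algorithm of Tarjan, and it reuses the assumed fragment of the composed system from the running example (a1 with a self-loop, a1 to b0, b0 to c1, c1 with a self-loop). All function names are hypothetical.

```python
def reach(start, edges):
    """States reachable from `start` along `edges` (including `start`)."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for (u, t) in edges:
            if u == s and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def sccs(states, trans):
    """Partition `states` into SCCs via forward/backward reachability."""
    rev = {(t, s) for (s, t) in trans}
    remaining, result = set(states), []
    while remaining:
        pivot = min(remaining)  # deterministic choice of pivot
        comp = reach(pivot, trans) & reach(pivot, rev) & remaining
        result.append(comp)
        remaining -= comp
    return result


def language_nonempty(init, trans, fair_sets):
    """True iff some reachable non-trivial SCC intersects every fair set."""
    reachable = set()
    for s in init:
        reachable |= reach(s, trans)
    for comp in sccs(reachable, trans):
        nontrivial = len(comp) > 1 or any((s, s) in trans for s in comp)
        if nontrivial and all(comp & F for F in fair_sets):
            return True
    return False


trans = {("a1", "a1"), ("a1", "b0"), ("b0", "c1"), ("c1", "c1")}
fair = [{"a0", "b0", "c0"}]

print(language_nonempty({"a1"}, trans, fair))                    # False
print(language_nonempty({"a1"}, trans | {("c1", "b0")}, fair))   # True
```

The first call finds the non-trivial SCCs {a1} and {c1}, neither of which intersects the fair set; the second call adds a hypothetical edge from c1 to b0, which merges b0 and c1 into a fair SCC.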
Tarjan's explicit SCC algorithm using depth-first search [Tar72] can
be used to decide language emptiness. The algorithm can be classified as
an explicit-state algorithm because it traverses one state at a time.
Tarjan's algorithm has the best asymptotic complexity bound: it is linear in the
number of states of the graph. However, for model checking
industrial-scale systems, even the performance of such a linear time algorithm is
not good enough due to the extremely large state space. A remedy to
the state explosion is a technique called "on-the-fly" model
checking [GPVW95, Hol97], which avoids the construction of the entire state
transition graph by visiting part of the state space at a time and
constructing part of the graph as needed. Its fair cycle detection is based
on two nested depth-first search procedures. Early termination, efficient
hashing techniques, and partial order reduction can be used to reduce
memory usage during the search and the number of interleavings that
need to be inspected. The scalability issue in explicit-state enumeration
makes such algorithms unsuitable for hardware designs, although they have been
successful in verifying controllers and software.


Symbolic state space traversal techniques are another effective way 
of dealing with the extremely large state transition graphs. Instead of 
manipulating each individual state separately, symbolic algorithms ma- 
nipulate sets of states. This is accomplished by representing the tran- 
sition relation of the graph and sets of states as Boolean formulae, and 
conducting the search by directly manipulating the symbolic represen- 
tations. By visiting a set of states at a time (as opposed to a single 
state), symbolic algorithms can traverse a very large state space using 
a reasonably small amount of time and memory. Thousands or even 
millions of states, for instance, can be visited in one symbolic step. 


An example of symbolic graph algorithms in the context of LTL model 
checking is the SCC hull algorithms introduced earlier (of which the 
Emerson and Lei algorithm [EL86] is a representative), wherein each im- 
age or pre-image computation is considered as a symbolic step. There is 
also another class of SCC computation algorithms based on symbolic tra- 
versal, called the SCC enumeration algorithms [XB99, BRS99, GPP03]. 
Both SCC hull algorithms and SCC enumeration algorithms can be used 
for fair cycle detection and therefore can decide $\mathrm{EG}_{fair}\,true$; however,
some of the enumeration algorithms have better complexity bounds than
SCC hull algorithms. For example, the Lockstep algorithm by Bloem
et al. [BRS99] runs in $O(\eta \log \eta)$ symbolic steps, and the more recent
algorithm by Gentilini et al. [GPP03] runs in $O(\eta)$ symbolic steps,
where $\eta$ is the number of states.


2.4 BDD-based Model Checking


In symbolic model checking, we manipulate sets of states instead of 
each individual state. Both the transition relation of the graph and the 
sets of states are represented by Boolean functions called characteristic 
functions, which are in turn represented by BDDs. Let the model be 
given in terms of 


- a set of present-state variables $x = \{x_1, \ldots, x_m\}$,
- a set of next-state variables $y = \{y_1, \ldots, y_m\}$, and
- a set of input variables $w = \{w_1, \ldots, w_n\}$.


The state transition graph can then be represented symbolically by
$\langle T, I \rangle$, where $T(x, w, y)$ is the characteristic function of the
transition relation, and $I(x)$ is the characteristic function of the
initial states. A state is a valuation of either the present-state or the
next-state variables. For $m$ state variables in the binary domain
$B = \{0,1\}$, the total number of valuations is $|B|^m = 2^m$.

If a valuation of the present-state variables, denoted by $\tilde{x}$, makes the
initial predicate $I(\tilde{x})$ evaluate to true, the corresponding state is an
initial state. Let $\tilde{x}$, $\tilde{y}$, and $\tilde{w}$ be the valuations of $x$, $y$, and $w$,
respectively; the transition relation $T(\tilde{x}, \tilde{w}, \tilde{y})$ is true if and only if
under the input condition $\tilde{w}$, there is a transition from the state $\tilde{x}$
to the state $\tilde{y}$.

In our running example in Figure 2.2, the present-state variables,
next-state variables, and inputs are $\{x_1, x_0\}$, $\{y_1, y_0\}$, and $\{w_0\}$,
respectively. The next-state functions of the two latches, in terms of the
present-state variables and inputs, are:


$$\Delta_1 = (x_1 \oplus x_0) \vee (x_1 \wedge x_0 \wedge \neg w_0) ,$$
$$\Delta_0 = (\neg x_1 \wedge \neg x_0 \wedge w_0) \vee (x_1 \wedge x_0 \wedge \neg w_0) .$$


Note that $\oplus$ denotes the exclusive OR operator. Boolean functions T
and I are given as follows:


$$T = (y_1 \leftrightarrow \Delta_1) \wedge (y_0 \leftrightarrow \Delta_0) ,$$
$$I = \neg x_1 \wedge \neg x_0 .$$


When $(x_1 = 0, x_0 = 0)$, $(y_1 = 0, y_0 = 1)$, and $w_0 = 1$, for instance,
the transition relation T evaluates to true, meaning a valid transition
exists from the state (0,0) to the state (0,1) under this particular input
condition.
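
The running example can be checked mechanically. The following Python sketch assumes the next-state functions read as $\Delta_1 = (x_1 \oplus x_0) \vee (x_1 \wedge x_0 \wedge \neg w_0)$ and $\Delta_0 = (\neg x_1 \wedge \neg x_0 \wedge w_0) \vee (x_1 \wedge x_0 \wedge \neg w_0)$; the function names are ours, and plain Python functions stand in for the characteristic functions that a model checker would store as BDDs.

```python
from itertools import product

def delta1(x1, x0, w0):
    # Assumed next-state function of latch 1: (x1 XOR x0) OR (x1 AND x0 AND NOT w0)
    return (x1 ^ x0) | (x1 & x0 & (1 - w0))

def delta0(x1, x0, w0):
    # Assumed next-state function of latch 0:
    # (NOT x1 AND NOT x0 AND w0) OR (x1 AND x0 AND NOT w0)
    return ((1 - x1) & (1 - x0) & w0) | (x1 & x0 & (1 - w0))

def T(x1, x0, w0, y1, y0):
    # Transition relation: each next-state variable equals its next-state function
    return y1 == delta1(x1, x0, w0) and y0 == delta0(x1, x0, w0)

def I(x1, x0):
    # Initial-state predicate: both latches are 0
    return x1 == 0 and x0 == 0

# Enumerate every (present state, next state) pair allowed by T under some input
transitions = sorted({((x1, x0), (y1, y0))
                      for x1, x0, w0, y1, y0 in product((0, 1), repeat=5)
                      if T(x1, x0, w0, y1, y0)})
```

Evaluating `T(0, 0, 1, 0, 1)` reproduces the transition from (0,0) to (0,1) discussed above.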

Computing the image or pre-image is the most fundamental step in
symbolic model checking. The image of a set of states consists of all the
successors of these states in the state transition graph; the pre-image of
a set of states consists of all their predecessors. In model checking, two
existential CTL formulae, EX(D) and EY(D), are used to represent the
pre-image and the image, respectively, of the set of states D under the
transition relation T. With a little abuse of notation, we use EX and EY
both as temporal operators and as set operators. These two basic
operations are defined as follows:


$$\mathrm{EX}_T(D) = \{ s \mid \exists s' \in D : (s, s') \in T \} ,$$
$$\mathrm{EY}_T(D) = \{ s' \mid \exists s \in D : (s, s') \in T \} .$$


When the context is clear, we will drop the subscripts and use EX and
EY instead. Given the symbolic representation of the transition relation
T and a set of states D, the pre-image and image of D are computed as
follows:

$$\mathrm{EX}_T(D) = \exists y, w \,.\, T(x, w, y) \wedge D(y) ,$$

$$\mathrm{EY}_T(D) = \exists x, w \,.\, T(x, w, y) \wedge D(x) .$$


When we use them inside a fixpoint computation, we usually represent
sets of states as Boolean functions in terms of the present-state variables
only. Therefore, before a pre-image computation we substitute the
present-state variables in D with the corresponding next-state variables,
and after an image computation we substitute the next-state variables in
the result with the corresponding present-state variables.
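
To make the two set operators concrete, here is a minimal explicit-state sketch in Python. It is only an illustration: Python sets stand in for BDDs, the set comprehension plays the role of the existential quantification, and the four-state graph is a hypothetical example of ours.

```python
def pre_image(T, D):
    """EX(D): states that have at least one successor in D."""
    return {s for (s, t) in T if t in D}

def image(T, D):
    """EY(D): states that have at least one predecessor in D."""
    return {t for (s, t) in T if s in D}

# Hypothetical transition relation, given as a set of (state, successor) pairs
T = {(0, 0), (0, 1), (1, 2), (2, 2), (3, 3), (3, 0)}

successors_of_0 = image(T, {0})        # states reached from 0 in one step
predecessors_of_0 = pre_image(T, {0})  # states that reach 0 in one step
```

In a BDD-based implementation the same two functions become a conjunction followed by quantification and a variable renaming, as described above.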

Many interesting temporal logic properties can be evaluated by applying
EX and EY repeatedly, until a fixpoint is reached. The set of
states that are reachable from I, for instance, can be computed by a
least fixpoint computation as follows:


$$\mathrm{EP}\, I = \mu Z .\ I \cup \mathrm{EY}(Z) .$$


Here EP I denotes the set of reachable states and $\mu$ represents the least
fixpoint computation. In this computation, we have $Z^0 = \emptyset$ and
$Z^{i+1} = I \cup \mathrm{EY}(Z^i)$ for all $i \ge 0$. That is, we repeatedly compute the
image of the set of already reached states starting from the initial states
I, until the result stops growing. Similarly, the set of states from which
D is reachable, denoted by EF D, can be computed by the least fixpoint
computation


$$\mathrm{EF}\, D = \mu Z .\ D \cup \mathrm{EX}(Z) .$$


This fixpoint computation is often called backward reachability.
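
The two least fixpoints can be sketched directly from their iteration schemes. The Python code below is an explicit-state illustration under our own assumptions (the helper `lfp`, the example graph, and the choice I = {0}, D = {2} are hypothetical); a symbolic implementation would perform the same iteration on BDDs.

```python
def image(T, D):      # EY(D)
    return {t for (s, t) in T if s in D}

def pre_image(T, D):  # EX(D)
    return {s for (s, t) in T if t in D}

def lfp(f):
    """Iterate Z^0 = {} and Z^{i+1} = f(Z^i) until the set stops growing."""
    Z = set()
    while True:
        Z_next = f(Z)
        if Z_next == Z:
            return Z
        Z = Z_next

# Hypothetical transition relation as a set of (state, successor) pairs
T = {(0, 0), (0, 1), (1, 2), (2, 2), (3, 3), (3, 0)}

reachable = lfp(lambda Z: {0} | image(T, Z))        # EP I with I = {0}
can_reach_2 = lfp(lambda Z: {2} | pre_image(T, Z))  # EF D with D = {2}
```

State 3 is not reachable from state 0 in this graph, yet it can reach state 2, which shows that forward and backward reachability generally yield different sets.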

The computation of EG D, on the other hand, corresponds to a greatest
fixpoint computation. (EG p means that there is a path on which p
always holds; in a finite state transition graph, such a path must
eventually enter a cycle.) It is defined as follows:


$$\mathrm{EG}\, D = \nu Z .\ D \cap \mathrm{EX}(Z) ,$$

where $\nu$ represents the greatest fixpoint computation. In this
computation, we have $Z^0$ set to the entire universe and
$Z^{i+1} = D \cap \mathrm{EX}(Z^i)$ for all $i \ge 0$.
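
The greatest fixpoint can be sketched in the same explicit-state style. In the Python code below, the function name `eg`, the example graph, and the set D are our own illustrative assumptions.

```python
def pre_image(T, D):  # EX(D)
    return {s for (s, t) in T if t in D}

def eg(T, D, universe):
    """EG D: start from the whole universe and shrink until stable."""
    Z = set(universe)
    while True:
        Z_next = D & pre_image(T, Z)
        if Z_next == Z:
            return Z
        Z = Z_next

# Hypothetical transition relation as a set of (state, successor) pairs
T = {(0, 0), (0, 1), (1, 2), (2, 2), (3, 3), (3, 0)}

stay_in_D = eg(T, {0, 1, 3}, {0, 1, 2, 3})
```

State 1 belongs to D but is dropped in the second iteration because its only successor, state 2, lies outside D.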

The computation of $\mathrm{EG}_{fair}\,true$ also corresponds to a set of fixpoint
computations. As pointed out in the previous section, this is a CTL
property augmented with a set of Büchi fairness constraints, and it can
be used to decide whether the language of a Büchi automaton is empty.
The formula can be evaluated through fixpoint computations as follows:


$$\mathrm{EG}_{fair}\,true = \nu Z .\ \mathrm{EX} \bigwedge_{F \in \mathcal{F}} \mathrm{E}\,[\, Z \ \mathrm{U}\ (Z \wedge F) \,] .$$


The evaluation corresponds to two nested fixpoint computations, a least
fixpoint (EU) embedded in a greatest fixpoint ($\nu Z .\ \mathrm{EX}$). In the previous
section, we have given a small example to illustrate the evaluation of
this formula.
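
The nesting of the two fixpoints becomes visible when the formula is written out as code. The following Python sketch works on explicit sets rather than BDDs; the function names, the example graph, and the fair sets are hypothetical, and the sketch assumes the formula reads as a greatest fixpoint over EX applied to the conjunction of the E[Z U (Z AND F)] terms.

```python
def pre_image(T, D):  # EX(D)
    return {s for (s, t) in T if t in D}

def eu(T, A, B):
    """E[A U B] as the least fixpoint of Z = B | (A & EX(Z))."""
    Z = set()
    while True:
        Z_next = B | (A & pre_image(T, Z))
        if Z_next == Z:
            return Z
        Z = Z_next

def eg_fair_true(T, fair_sets, universe):
    """Greatest fixpoint over Z; one inner least fixpoint per fair set."""
    Z = set(universe)
    while True:
        inner = set(universe)
        for F in fair_sets:
            inner &= eu(T, Z, Z & F)
        Z_next = pre_image(T, inner)
        if Z_next == Z:
            return Z
        Z = Z_next

# Hypothetical transition relation as a set of (state, successor) pairs
T = {(0, 0), (0, 1), (1, 2), (2, 2), (3, 3), (3, 0)}
U = {0, 1, 2, 3}

fair_through_3 = eg_fair_true(T, [{3}], U)
fair_through_0_and_1 = eg_fair_true(T, [{0}, {1}], U)
```

The language of the automaton is nonempty exactly when the returned set contains an initial state; that final intersection is left out of the sketch.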

As mentioned before, we can use symbolic algorithms to enumerate 
the SCCs in a graph [XB99, BRS99, GPP03]. Conceptually, an SCC 
enumeration algorithm works as follows (here we take the algorithm in 
[XB99] as an example, for it is the simplest among the three and it 
serves as a stepping stone for understanding the other two). First, we 
pick an arbitrary state v as seed and compute both EF v and EP v. EF v
consists of states that can reach v and EP v consists of states reachable
from v. We then intersect the two sets of states to get an SCC (and the
intersection is guaranteed to be an SCC). If the SCC fails to intersect
every fair set $F_i \in \mathcal{F}$, we remove it from the graph and pick
another seed from the remaining graph. We keep doing that until we
find an SCC satisfying all the acceptance conditions, or no state is
left in the graph. Although SCC enumeration algorithms may have
better complexity bounds than SCC hull algorithms, for industrial-scale 
systems, applying any of these algorithms directly to the concrete model 
remains prohibitively expensive. 
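
The seed-based enumeration just described can be sketched as follows. This is an explicit-state Python illustration of the idea, not the algorithm of [XB99] itself; the function names and the example graph are ours, and the extra test that the SCC contains a cycle is our addition, needed because a single state without a self-loop cannot carry a fair cycle.

```python
def closure(step, seed):
    """Least fixpoint of Z = {seed} | step(Z)."""
    Z = {seed}
    while True:
        Z_next = Z | step(Z)
        if Z_next == Z:
            return Z
        Z = Z_next

def find_fair_scc(T, states, fair_sets):
    remaining = set(states)
    while remaining:
        # Restrict the graph to the states that have not been removed yet
        edges = {(s, t) for (s, t) in T if s in remaining and t in remaining}
        seed = min(remaining)  # any remaining state would do
        forward = closure(lambda Z: {t for (s, t) in edges if s in Z}, seed)   # EP v
        backward = closure(lambda Z: {s for (s, t) in edges if t in Z}, seed)  # EF v
        scc = forward & backward
        has_cycle = len(scc) > 1 or (seed, seed) in edges
        if has_cycle and all(scc & F for F in fair_sets):
            return scc
        remaining -= scc
    return None

# Hypothetical transition relation as a set of (state, successor) pairs
T = {(0, 0), (0, 1), (1, 2), (2, 2), (3, 3), (3, 0)}

fair_scc = find_fair_scc(T, range(4), [{2}])
```

The sketch does not check that the returned SCC is reachable from an initial state; restricting `states` to the reachable states beforehand would add that condition.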

For certain subclasses of LTL properties, there exist specialized
algorithms that often are more efficient than the evaluation of
$\mathrm{EG}_{fair}\,true$. One way of finding these subclasses is to classify LTL
properties by the strength of the corresponding Büchi automata. According
to [BRS99], the strength of a property Büchi automaton can be classified
as strong, weak, or terminal. If the property automaton is classified as
strong, checking the language emptiness of the composed system requires
the evaluation of the general formula $\mathrm{EG}_{fair}\,true$. When the property
automaton is weak, language emptiness checking in the composed system
only requires the evaluation of EF EG fair; when it is terminal, only
EF fair is required. Note that the latter two formulae are much easier to
evaluate since they correspond to two fixpoint computations aligned in a
row or to a single fixpoint computation, rather than two fixpoint
computations with one nested inside the other as in $\mathrm{EG}_{fair}\,true$.

Let us take the invariant property G p as an example. Its property
automaton, which accepts the negated property F ¬p, is given at the
left-hand side of Figure 2.6. The only acceptance condition of the
automaton is {2}, or $\mathcal{F} = \{\{2\}\}$. The SCC {2} is a fair
SCC since it satisfies the acceptance condition; furthermore, the SCC is
maximal in the sense that no outgoing transition exists. For the
automaton at the left-hand side, we can mark State 1 accepting as well;
that is, fair = {1, 2}. The automaton remains equivalent to the original
one because both accept the same ω-regular language. However, this new
property automaton can be classified as a terminal automaton although,
according to the definition in [BRS99], the original one is classified as
weak. For a terminal property automaton, the language of the composed
system is empty as long as no state in the fair SCC is reachable. This
is equivalent to evaluating the much simpler formula EF fair (which has
a complexity similar to that of reachability analysis). Note also that
when EF fair is used instead of $\mathrm{EG}_{fair}\,true$, we may end up producing a
shorter counterexample.


[Figure omitted: left, the automaton for F ¬p; right, the same automaton,
with more accepting states.]


Figure 2.6. Two terminal generalized Büchi automata.


2.4.1 Binary Decision Diagrams 

Set operations encountered in symbolic fixpoint computations, includ- 
ing intersection, union, and existential quantification, are implemented 
as BDD operations. BDDs are an efficient data structure for representing
Boolean functions. The BDD in its current form is both reduced and
ordered, and is called the Reduced Ordered BDD (ROBDD). The ROBDD was
first introduced by Bryant [Bry86], although the general ideas of branching
programs have been available for a long time in the theoretical computer
science literature. Symbolic model checking based on BDDs, introduced
by McMillan [McM94], is considered a major breakthrough in increasing
the model checker's capacity, leading to the subsequent widespread
acceptance of model checking in the computer hardware industry.

Given a Boolean function, we can build a binary decision tree by 
obeying a linear order of decision variables; that is, along any path from 
root to leaf, the variables appear in the same order and no variable 
appears more than once. We further restrict the form of the decision 
tree by repeatedly merging any duplicate nodes and removing nodes 
whose if and else branches are pointing to the same child node. The 
resulting data structure is a directed acyclic graph. Conceptually, this is 
how we construct the ROBDD for a given Boolean function. In practice, 
BDDs are created directly in the fully reduced form without the need 
to build the original decision tree in the first place. BDDs representing 
multiple Boolean functions are also merged into a single directed graph 
to increase the sharing; such a graph would have multiple roots, one for 
each Boolean function. 

The formal definition of a BDD is given as follows:


DEFINITION 2.9 A BDD is a directed acyclic graph $(\Phi \cup V \cup \{1\}, E)$
representing a set of functions $f_i : \{0,1\}^n \to \{0,1\}$. The nodes
$\Phi \cup V \cup \{1\}$ are partitioned into three subsets.


- 1 is the only terminal node whose out-degree is 0.


- V is the set of internal nodes whose out-degree is 2 and whose
  in-degree is at least 1. Every node $v \in V$ corresponds to a Boolean
  variable $l(v)$ in the support of functions $\{f_i\}$; the n variables
  $\{l(v)\}$ in the entire graph are ordered as follows: if $v_j$ is a
  descendant of $v_i$, then $l(v_i) < l(v_j)$.


- $\Phi$ is the set of function nodes whose out-degree is 1 and whose
  in-degree is 0; the function nodes are in one-to-one correspondence with
  the $f_i$'s.


E is the set of edges connecting the nodes. The outgoing edge of a function
node may have the complement attribute. The two outgoing edges of
a node $v \in V$ are labeled T and E, respectively. The E edges may have
the complement attribute. We write $(l(v), T(v), E(v))$ to indicate an
internal node and its two outgoing edges.

The set of functions represented by a BDD is defined as follows: (1)
The function of the only terminal node, 1, is true. (2) The function
of a regular edge is the function of the head node; the function of a
complement edge is the complement of the function of the head node.
(3) The function of a node $v \in V$ is $(l(v) \wedge f_T) \vee (\neg l(v) \wedge f_E)$,
where $f_T$ and $f_E$ are the functions of its T and E edges. (4) The function
of $\phi \in \Phi$ is the function of its outgoing edge.

BDDs provide a very compact representation for many Boolean functions
found in practice, although in the worst case the size of a BDD
may become exponential with respect to the number of support variables.
(An example for the worst-case blowup is a multiplier, which is
known to have an exponential number of BDD nodes regardless of the
BDD variable order [Bry86].) In addition to the compactness, BDDs
are also easy to manipulate. Efficient algorithms exist for almost all the
common set-theoretic operations. For example, the intersection or union
of two BDDs takes time proportional to the product of their respective
sizes in the worst case.

BDDs are also a canonical representation in the sense that with a fixed 
variable ordering, every Boolean function has a unique BDD representa- 
tion. Therefore, checking whether two Boolean functions are the same 
is reduced to a pointer comparison. Given a BDD, complementation or 
the validity check takes constant time as well. 
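
The two reduction rules and the resulting canonicity can be illustrated with a very small ROBDD manager. The Python sketch below makes several simplifying assumptions that differ from Definition 2.9 and from production packages: it uses two terminal nodes (0 and 1) instead of one terminal with complement edges, it has no function nodes, and it builds the diagram by exhaustive Shannon expansion, which mirrors the conceptual construction from a decision tree rather than the way BDDs are created in practice. The class and method names are ours.

```python
class BDD:
    """Tiny ROBDD manager: terminals 0 and 1, no complement edges."""

    def __init__(self, num_vars):
        self.num_vars = num_vars
        self.unique = {}  # (var, low, high) -> node id; ids 0 and 1 are terminals

    def mk(self, var, low, high):
        if low == high:             # both branches agree: the test is redundant
            return low
        key = (var, low, high)
        if key not in self.unique:  # merge duplicate nodes via the unique table
            self.unique[key] = len(self.unique) + 2
        return self.unique[key]

    def build(self, f, var=0, env=()):
        """Shannon-expand f in the fixed variable order 0, 1, ..., num_vars - 1."""
        if var == self.num_vars:
            return 1 if f(*env) else 0
        low = self.build(f, var + 1, env + (0,))
        high = self.build(f, var + 1, env + (1,))
        return self.mk(var, low, high)


bdd = BDD(3)
f = bdd.build(lambda a, b, c: (a and b) or c)
g = bdd.build(lambda a, b, c: not ((not a or not b) and not c))
same = (f == g)  # equivalence check reduces to comparing two node ids
```

Because both formulas denote the same function, `build` returns the same node id for `f` and `g`, which is the pointer-comparison equivalence check mentioned above.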

The complexity of symbolic model checking depends on the size of the
BDDs involved in the symbolic steps, such as the BDDs that represent 
the transition relation and sets of states. Because of this, the search for 
heuristics to avoid the BDD blow-up in the context of image computation 
and symbolic fixpoint computation has been one of the major research 
topics in formal verification. 

CU Decision Diagram (CUDD) [Som] is a public-domain decision diagram
package developed at the University of Colorado. CUDD has
been used widely in both industry and academia. The package provides
a large set of operations to manipulate BDDs, Algebraic Decision
Diagrams (ADDs) [BFG+93], and Zero-suppressed Binary Decision
Diagrams (ZDDs) [Min93]. The latter two are variants of BDDs. In
particular, ADDs represent functions from $B^n$ to an arbitrary set (e.g.,
the integer domain), as opposed to B in BDDs. ZDDs represent switching
functions like BDDs, but they are much more efficient than BDDs when
the functions to be represented are characteristic functions of cube sets,
or in general, when the ON-set of the function to be represented is very
sparse. They are inferior to BDDs in other cases. The CUDD package
also provides functions to convert BDDs into ADDs or ZDDs and vice
versa, and a large assortment of variable reordering methods.


2.5 SAT and Bounded Model Checking


SAT based Bounded Model Checking (BMC) is a complementary tech- 
nique to BDD based symbolic model checking. BMC was first introduced 
by Biere et al. in [BCCZ99]. Given a model and an LTL property, BMC 
searches in the given model for counterexamples of a finite length. The 
existence of a finite length counterexample is encoded as a Boolean for- 
mula, which is satisfiable if and only if the counterexample exists. The 
satisfiability problem of a Boolean formula can be decided by a SAT 
solver. Since modern SAT solvers [SS96, MMZ+01, GN02] often suffer
less from the potential search space explosion, in practice, SAT based 
bounded model checking can handle some industrial-scale circuits that 
are beyond the reach of BDD based techniques. 


In bounded model checking, one can keep increasing the counterexample
length k (also called the unrolling depth) until either a counterexample
is found or k exceeds a predetermined completeness threshold. A
completeness threshold is a constant $k_c$ such that if we cannot find any
counterexample of length less than or equal to $k_c$, we have proved that
the property holds. It is clear that $k_c \le \eta$, where $\eta$ is the number of
states of the Kripke structure, since any finite run of the Kripke
structure with distinct states cannot be longer than that. Therefore, BMC
can be regarded as transforming the PSPACE-complete problem of LTL model
checking into a finite number of Boolean satisfiability checks. Although
each individual SAT problem in this process is NP-complete, the total
number of SAT problems, or $k_c$, can be exponential with respect to the
number of state variables. In practice, the SAT checks often slow down
significantly after k goes beyond a few hundred steps.


A better completeness threshold than $\eta$ would be the diameter of the
state transition graph [KS03], i.e., the length of the longest shortest 
path between any two states. For safety properties, one can go one step 
further and use the reachable diameter of the graph, which is the length 
of the longest shortest path between an initial state and another state. 
While the diameter of a design may be exponential in the number of its 
state elements, Baumgartner and Kuehlmann [BK04] have observed that 
in practice it often ranges from tens to a few hundred regardless of design 
size. They also proposed a general approach for enabling the use of
structural transformations to tighten the bounds obtained by arbitrary
diameter approximation techniques. However, despite this previous
research, computing a tight bound on the diameter of an extremely
large graph remains very hard in practice. In the absence of a reasonably
small completeness threshold, people use BMC primarily for detecting
bugs rather than for proving properties.

Other SAT based methods have also been used to prove properties.
These include methods based on induction [SSS00, dRS03, GGW+03a],
unbounded model checking through the enumeration of all SAT solu- 
tions [GYAG00, McM02, KP03, GGA04], and the interpolation based 
method [MA03]. Other more recent works have extended the BMC in- 
duction proof methods to handle liveness properties [AS04, GGA05a]. 

The Boolean formula used to encode the existence of a finite length
counterexample consists of two subformulae: $\Phi = \Phi_M \wedge \Phi_P$. The first
subformula, denoted by $\Phi_M$, captures all the length k execution traces
that are possible in the Kripke structure, all of which start from the
initial states. The second subformula, denoted by $\Phi_P$, captures the
constraint for a length k path to violate the given property. The
conjunction of the above two subformulae captures all the length k
counterexamples with respect to the property. Such a counterexample exists
if and only if the Boolean formula has a satisfying assignment.

First, we explain how to create the subformula $\Phi_M$. We use V to
represent the set of state variables (or latches) and U to represent the
rest of the signals (primary inputs, outputs, and signals of internal logic
gates). We then replicate these variables at every clock cycle: we use
$V^i = \{v^i_1, \ldots, v^i_m\}$ to represent the set of state variables at the i-th
time frame and $U^i = \{u^i_1, \ldots, u^i_n\}$ to represent the set of other signals
at the i-th time frame. Now we can unroll the sequential circuit by
making multiple copies of the symbolic transition relation, and create a
combinational circuit. When the BMC unrolling depth is k,


$$\Phi_M = I(V^0) \wedge \bigwedge_{1 \le i \le k} T(V^{i-1}, U^{i-1}, V^i) ,$$


where $I(V^0)$ requires that all paths must start from an initial state, and
the rest is the conjunction of k copies of the transition relation.

The transition relation copy at the i-th time frame is denoted by
$T(V^{i-1}, U^{i-1}, V^i)$, which is the conjunction of elementary transition
relations,

$$T(V^{i-1}, U^{i-1}, V^i) = \bigwedge_{1 \le j \le n} T_j(V^{i-1}, U^{i-1}) \wedge \bigwedge_{1 \le j \le m} (v^i_j \leftrightarrow u^{i-1}_j) .$$


Each $T_j$ is called a gate relation for it describes the behavior of a logic
gate. For instance, if $u_j$ is the output variable of a two-input AND gate
with inputs $u_k$ and $u_l$, then $T_j = u_j \leftrightarrow (u_k \wedge u_l)$. However, if $u_j$ is
a primary input to the circuit, then $T_j = 1$. Each term of the form
$(v^i_j \leftrightarrow u^{i-1}_j)$ equates a next-state variable to a combinational variable,
describing that the output of a logic gate is fed to the data input of the
j-th register. Adjacent transition relation copies are effectively connected
together through the use of shared state variables in $V^i$.

Next we explain the creation of the subformula $\Phi_P$, which states that
a path of length k must violate the given property. To explain the basic
idea, we use the invariant property G p as an example. For the encoding
of a general LTL formula in BMC, the readers are referred to the original
BMC paper [BCCZ99]. Note that the property G p fails if and only if,
for some k, there is a path of length k from an initial state to a state
labeled $\neg p$. Therefore,

$$\Phi_P = \neg P(V^k) ,$$


which indicates that the last state is a “bad” state. We use $P(V^k)$ to
denote the predicate that the state $V^k$ satisfies the propositional formula
p. In general $\Phi_P$ should be the disjunction of $\neg P(V^i)$ for $0 \le i \le k$.
However, if we start BMC with k = 0 and keep increasing the unrolling
depth by 1 at a time, by the time it reaches k we know that the “bad”
state cannot be found at any of the depths 0 through k − 1; therefore,
$\Phi_P$ can be simplified into $\neg P(V^k)$. To summarize, the entire BMC
instance at the unrolling depth k is given as follows,


$$\Phi = I(V^0) \wedge \bigwedge_{1 \le i \le k} T(V^{i-1}, U^{i-1}, V^i) \wedge \neg P(V^k) .$$


This formula can be viewed as a pure combinational circuit with some
environmental constraints. Figure 2.7 illustrates such a view for the
unrolling depth 2.
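
The structure of the BMC instance can be illustrated on a toy design. In the Python sketch below, the design (a 2-bit counter with an enable input), the invariant, and all function names are hypothetical, and an exhaustive enumeration of assignments stands in for the SAT solver call; only the shape of the formula follows the text.

```python
from itertools import product

def I(v):             # initial-state predicate: the counter starts at 0
    return v == 0

def T(v, u, v_next):  # hypothetical 2-bit counter with an enable input u
    return v_next == (v + u) % 4

def P(v):             # invariant under test: the counter never reaches 3
    return v != 3

def bmc_formula(k, V, U):
    """I(V^0) AND T(V^{i-1}, U^{i-1}, V^i) for 1 <= i <= k AND NOT P(V^k)."""
    return (I(V[0])
            and all(T(V[i - 1], U[i - 1], V[i]) for i in range(1, k + 1))
            and not P(V[k]))

def bmc(max_depth):
    for k in range(max_depth + 1):
        # Brute-force stand-in for the SAT check at unrolling depth k
        for V in product(range(4), repeat=k + 1):
            for U in product((0, 1), repeat=k):
                if bmc_formula(k, V, U):
                    return k, V, U  # depth, state sequence, input sequence
    return None

counterexample = bmc(5)
```

Because the depth is increased by one at a time, the first satisfying assignment found is a shortest counterexample; here it has length 3.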


[Figure omitted: the circuit unrolled for two time frames, constrained by
$I(V^0)$ and $\neg P(V^2)$.]


Figure 2.7. A bounded model checking instance.


Now, we explain how to convert the Boolean formula $\Phi$ into the
Conjunctive Normal Form (CNF). This step is necessary because in practice,
we often use an off-the-shelf Boolean SAT solver to decide $\Phi$, and most
of the modern SAT solvers accept the CNF input format. Furthermore,
CNF is also adopted by many solvers as an efficient data structure for
internal representation. A CNF formula is the conjunction of a set of
clauses, each of which is a disjunction of literals. A literal is the positive
(or negative) phase of a Boolean variable. As an example, the following
formula fragment


$$(a \vee \neg c) \wedge (b \vee \neg c) \wedge (\neg a \vee \neg b \vee c) \wedge \ldots$$


has three variables (a, b, and c), six literals (a, ¬a, b, ¬b, c, and ¬c),
and three clauses.

Both general Boolean formulae and combinational circuits can be
converted into CNF formulae in linear time if we are allowed to add
auxiliary variables. The size of the resulting CNF formula is also linear
with respect to the size of the input formula or the size of the input
circuit. Since the transition relation of a Kripke structure can be
represented as a network of logic gates, we only need to consider the
problem of encoding the individual combinational logic gates as
conjunctions of clauses. The solution to the latter problem is actually
very straightforward [CLR90]. For instance, a two-input AND gate $u_j$ with
inputs $u_k$ and $u_l$ has the following set of clauses:


$$(u_k \vee \neg u_j) \wedge (u_l \vee \neg u_j) \wedge (\neg u_k \vee \neg u_l \vee u_j) .$$
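
The clause set for the AND gate can be validated by enumerating its truth table. In the Python sketch below, the integer encoding of literals (a negative integer denotes a negated variable, as in common CNF file formats) and the function names are our assumptions.

```python
from itertools import product

def and_gate_clauses(j, k, l):
    """CNF for u_j <-> (u_k AND u_l); a negative integer is a negated literal."""
    return [[k, -j], [l, -j], [-k, -l, j]]

def satisfied(clauses, value):
    """True if every clause has at least one literal that evaluates to true."""
    return all(any(value[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

clauses = and_gate_clauses(3, 1, 2)  # variable 3 is the output, 1 and 2 the inputs

# The clauses hold exactly for the assignments where output == (input AND input)
consistent = all(
    satisfied(clauses, {1: a, 2: b, 3: c}) == (c == (a and b))
    for a, b, c in product((False, True), repeat=3)
)
```

The check confirms that the three clauses admit exactly the four assignments in which the output equals the conjunction of the inputs.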


Finally, we review a procedure used in many modern SAT solvers [SS96,
MMZ+01, GN02] to decide the satisfiability of a CNF formula. A formula
is satisfiable if and only if there exists a set of assignments to the
variables that makes the formula true. It is clear that for the entire CNF
formula to be satisfied, each individual clause must also be satisfied.
A clause is satisfied if and only if at least one of its literals evaluates
to true. The SAT problem of a CNF formula can be solved by the
Davis-Logemann-Loveland (DLL) recursive search procedure [DLL62]. Its
basic steps are making decisions and propagating the implications of these
decisions. Selecting a literal and making it true is called a decision. If a
clause has only one unassigned literal and all the other literals are false,
it is called a unit clause. Every unit clause triggers an implication: its
only unassigned literal has to be true; otherwise, the clause can no longer
be satisfied. The process of applying implications iteratively until no unit
clause is left is called Boolean Constraint Propagation (BCP).
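
Unit propagation is simple enough to sketch in a few lines. The Python code below is a naive illustration under our own assumptions: clauses are lists of signed integers, the assignment is a dictionary from variable to Boolean value, and every clause is rescanned on each pass, whereas real solvers use far more efficient data structures.

```python
def bcp(clauses, assignment):
    """Apply unit-clause implications until none is left or a clause is false."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # the clause is already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "CONFLICT", assignment  # every literal is false
            if len(free) == 1:                 # unit clause: imply its last literal
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return "OK", assignment

# AND gate u_3 <-> (u_1 AND u_2): setting the output to true implies both inputs
gate = [[1, -3], [2, -3], [-1, -2, 3]]
status, implied = bcp(gate, {3: True})
```

Setting the output of the AND gate to true makes the first two clauses unit, so both inputs are implied to be true without any further decision.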

A decision and the corresponding BCP restrict our attention to a
subformula or a subset of the original clauses, since the rest of the clauses
have been made true. If we keep making decisions on free variables and
performing BCP until no subformula remains to be decided, the formula
is proved to be satisfiable. However, if the remaining subformula is not
satisfiable (i.e., some of its clauses become false), we need to backtrack
and flip some of the assignments made earlier.


SATCHECK( ) {
    while (true) {
        if ( MAKEDECISION( ) ) {
            while ( BCP( ) == CONFLICT ) {
                level = CONFLICTANALYSIS( );
                if (level < 0)
                    return UNSAT;
                else
                    BACKTRACK(level);
            }
        }
        else
            return SAT;
    }
}


Figure 2.8. The DLL Boolean SAT procedure.


The pseudo code of a DLL procedure is given in Figure 2.8. It makes
decisions and then applies BCP inside the while loop. If all the variables
have been assigned and no conflict occurs, the procedure MAKEDECISION
will return false and the procedure SATCHECK will return SAT; the current
assignments then form a complete satisfying assignment. If a conflict
occurs after a partial set of assignments (that is, some clause becomes
false), the procedure BCP will return CONFLICT, indicating that a previous
decision is not appropriate. Inside the procedure CONFLICTANALYSIS, the
level of that previous decision is identified by conflict analysis [SS96],
following which the inappropriate decision is undone by backtracking.
Before backtracking, a conflict clause learned from this analysis can be
added to the clause database (i.e., conjoined with the original formula)
to prevent the search from repeating this mistake in the future. When the
backtracking level is less than 0, the given formula is proved to be
unsatisfiable, because there is a conflict before we make any decision.
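
The decide-propagate-backtrack cycle can be sketched as a short recursive procedure. The Python code below is a deliberately simplified variant, not the procedure of Figure 2.8: it backtracks chronologically through recursion and performs no conflict analysis or clause learning. The clause encoding (signed integers) and the function names are our assumptions.

```python
def bcp(clauses, assignment):
    """Unit propagation; returns ("CONFLICT" or "OK", extended assignment)."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "CONFLICT", assignment
            if len(free) == 1:
                assignment[abs(free[0])] = free[0] > 0
                changed = True
    return "OK", assignment

def dpll(clauses, assignment=None):
    """Return a satisfying assignment as a dict, or None if unsatisfiable."""
    status, assignment = bcp(clauses, assignment or {})
    if status == "CONFLICT":
        return None
    free = sorted({abs(lit) for c in clauses for lit in c} - set(assignment))
    if not free:
        return assignment  # all variables assigned and no clause is false
    for value in (True, False):  # decision on the first free variable
        model = dpll(clauses, {**assignment, free[0]: value})
        if model is not None:
            return model
    return None  # both values failed: backtrack to the caller

model = dpll([[1, -3], [2, -3], [-1, -2, 3], [3]])
```

Adding the unit clause for variable 3 to the AND-gate clauses forces all three variables to true through BCP alone, so no decision is needed in this small example.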

There are also efficient implementations of the Boolean SAT solver 
that do not rely solely on the CNF representation [KGP01, KPKG02]. 
Since many SAT problems in EDA, including bounded model checking,
are typically derived from the circuit structure, it is advantageous
to work directly on the circuit representation, and use circuit-specific
heuristics to guide the search. In [GAG+02], Ganai et al. proposed a
hybrid SAT solver which combines the benefits of circuit-based and CNF
SAT techniques. Other more recent efforts have also combined the
advantages of multiple symbolic representations, including circuit graphs,
CNF, and BDDs [GGW+03b, IPC03, LWCH03, JS04, JAS04].


2.6 Abstraction Refinement Framework 


Abstraction refinement was first introduced by Kurshan [Kur94] in
checking linear time properties specified by ω-regular automata. It is
an important technique to bridge the capacity gap between the model 
checker and large digital systems. When a model cannot be directly 
handled by the model checker due to the limited capacity of the model 
checking algorithms, abstraction can be used to simplify the model by 
removing the information that is irrelevant to verification. To simplify 
verification, we want to retain only the relevant details with respect to 
deciding the property at hand. The key issue in abstraction refinement 
is identifying in advance which part of the model is relevant and which 
is not. 

There are automatic techniques for computing a simplified model in
which a certain class of temporal logic properties can be preserved.
For instance, bi-simulation based reduction [Mil71, DHWT91] preserves
the full propositional μ-calculus (hence the entire CTL since all CTL
formulae can be evaluated through the translation to fixpoint
computations in propositional μ-calculus). A nice feature of these
techniques is that we only need to compute the reduction for a given model
once, and then use the simplified model to check all kinds of properties in
that class. In practice, however, bi-simulation and other property
preserving abstractions are less attractive because they are either hard to
compute, or do not achieve a drastic reduction. A previous study by
Fisler and Vardi [FV99] has demonstrated that the bi-simulation relation is
often expensive to compute and bi-simulation based simplification does
not speed up CTL model checking.

A more practical abstraction approach is called property driven
abstraction, which often results in a significantly smaller model that
preserves or partially preserves a given property (as opposed to a class of
properties). This abstraction approach is frequently used in the
iterative abstraction refinement loop. There are various ways of deriving
such an abstract model [BSV93, Lon93, CHM+96a]. Most of them create
abstract models that are upper bounds or over-approximations of the
exact system, which may have more behavior than the concrete model.
Over-approximated abstraction may produce false negatives when being
used to verify universal properties such as AG p: If a property holds in
the abstract model, it also holds in the concrete model; however, if the
property fails in the abstract model, it may still pass in the concrete
model, in which case the property remains undecided.

There are also lower bounds or under-approximations of the exact 
system. These abstractions are conservative as well because they may 
produce false positives when being used to check universal properties. 
Since the abstract models have less behavior than the exact system, if 
a counterexample is found in the abstract model, then it is also a coun- 
terexample in the exact system. However, if no counterexample exists 
in the abstract model, we cannot conclude that the property is true. In 
other words, these abstractions can only refute a property but cannot 
prove it. Note that one can also use under-approximations and over- 
approximations simultaneously in a single iterative refinement process, 
to tighten up abstraction from both ends. 


[Figure omitted: flow chart of the refinement loop; the recoverable box
labels are "Initial Abstraction" and "Model Checking".]


Figure 2.9. The abstraction refinement framework.


Since the property driven abstraction is conservative, we need an
iterative process to refine the abstract model until it becomes deciding.
In abstraction refinement, one seeks the simplest abstraction that can
either prove or refute the given property. Such an abstraction is called
the final or deciding abstraction for the given model checking problem.
Figure 2.9 shows the generic framework for iterative abstraction refinement.
Given a model and an LTL property, for instance, one can build a very 
coarse initial abstraction that is an over-approximation of the concrete 
model [Kur94, HITK96]. We then use a model checker to decide whether 
the abstract model satisfies the property. If the property is satisfied by 
the abstract model, it is also satisfied by the concrete model. If the 
property fails in the abstract model, there is no conclusive result yet. 

At this point, the model checker returns an abstract counterexample
showing how the property is violated. During counterexample analysis,
we check whether this abstract counterexample contains a valid path in
the concrete model. One way of doing that is to use a concretization
test to reconstruct the abstract path in the concrete model [CGJ*00,
WHL*01]. If this is possible and we find a real path within the given
abstract counterexample, the property is refuted. Otherwise, the
abstract counterexample is declared spurious, which means that some
important information about the exact system is missing and therefore
the current abstraction needs to be refined.

During refinement, one can use spurious counterexamples to guide the 
identification of missing information in the current abstract model [Kur94, 
CGJ*00]. Often, the immediate goal in counterexample guided refine- 
ment is to remove the set of spurious counterexamples. That is, one 
searches for a set of refinement variables such that adding them to
the current abstract model removes the spurious counterexamples.

After computing the set of refinement variables, we can build the new 
abstract model by including their corresponding bit transition relations. 
We then start the model checker again. This iterative process terminates 
when either the property is decided, or the available computing resources 
(i.e., CPU time and memory) are depleted. 
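
The loop described above can be sketched on a toy model. The following is a minimal illustration under several assumptions that are not part of the text: the concrete model is closed and deterministic (one next-state function per bit, no inputs), states are explored explicitly rather than symbolically, and the refinement step is a naive placeholder that simply makes the lowest-index invisible bit visible. All function names are hypothetical.

```python
from itertools import product


def abstract_successors(state, next_fns, visible):
    """Visible bits follow their next-state functions; invisible bits are
    left unconstrained, which over-approximates the concrete transitions."""
    choices = [(next_fns[j](state),) if j in visible else (0, 1)
               for j in range(len(state))]
    return sorted(product(*choices))


def shortest_abstract_cex(init, next_fns, visible, bad):
    """Breadth-first search in the abstract model. Returns a shortest path
    to a bad state, or None if no bad state is reachable."""
    n = len(next_fns)
    starts = sorted(s for s in product((0, 1), repeat=n)
                    if any(all(s[j] == i[j] for j in visible) for i in init))
    parent = {s: None for s in starts}
    frontier = starts
    while frontier:
        next_frontier = []
        for s in frontier:
            if bad(s):
                path = []
                while s is not None:
                    path.append(s)
                    s = parent[s]
                return path[::-1]
            for t in abstract_successors(s, next_fns, visible):
                if t not in parent:
                    parent[t] = s
                    next_frontier.append(t)
        frontier = next_frontier
    return None


def concretize(path, init, next_fns, visible, bad):
    """Concretization test for a deterministic model: simulate from each
    initial state and compare the visible bits with the abstract path."""
    for c in init:
        trace = [tuple(c)]
        for _ in path[1:]:
            trace.append(tuple(f(trace[-1]) for f in next_fns))
        matches = all(c_i[j] == a_i[j]
                      for c_i, a_i in zip(trace, path) for j in visible)
        if matches and bad(trace[-1]):
            return trace
    return None


def refine_until_decided(init, next_fns, bad, visible):
    """Iterate: model check the abstraction, analyze the counterexample,
    refine. Returns ("holds", visible_set) or ("fails", concrete_trace)."""
    visible = set(visible)
    while True:
        cex = shortest_abstract_cex(init, next_fns, visible, bad)
        if cex is None:
            return "holds", visible
        real = concretize(cex, init, next_fns, visible, bad)
        if real is not None:
            return "fails", real
        # Naive placeholder: make the lowest-index invisible bit visible.
        visible.add(min(set(range(len(next_fns))) - visible))


# Toy model: state = (b0, b1), cycling 00 -> 10 -> 01 -> 00; 11 is unreachable.
counter = [lambda s: int(not s[0] and not s[1]),   # next value of b0
           lambda s: int(s[0] and not s[1])]       # next value of b1
```

Starting with only `b1` visible, the invariant "never (1, 1)" first yields a spurious abstract counterexample; after `b0` is added the abstraction proves the property.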


Chapter 3 


ABSTRACTION 


Abstraction is a key to model checking large real-world systems. The
idea is to use a simplified model to help verify the original model. The
definition of simplified model depends on the type of algorithms used in
the verification process. In this chapter, abstraction is defined in terms
of the simulation relation, and in such a way that the abstract models
can be constructed directly from a high level description of the system,
even before the concrete model of the system is available.


The abstraction granularity is very important in achieving a higher
abstraction efficiency. In previous work, the abstraction granularity is
often restricted to the state variable level: binary state variables,
together with their bit transition relations, are treated as atoms. This
approach
is too coarse for most industrial-scale circuits with large combinational 
logic cones. In this chapter, we propose a finer grain abstraction ap- 
proach [WHS04] which goes beyond the usual state variable level. In 
the extreme case, we can treat every combinational logic gate as an 
atom for abstraction; refinement then becomes a process of synthesizing 
a final abstract model with the fewest logic gates. 


Abstract models used in this chapter are over-approximations of the 
original model. If the property fails in an abstract model, we will system- 
atically analyze the abstract counterexamples. For invariant properties, 
we propose a data structure called the synchronous onion rings (SORs) 
to symbolically capture all the shortest counterexamples and no other 
counterexamples. The SORs are then used to concretize all the shortest 
counterexamples simultaneously with a SAT solver, and to compute the 
refinement set. 
3.1 Introduction


The concrete model is considered as a formal description of the com- 
plete behavior of the system. Abstraction preserves only part of the 
behavior that is relevant to the verification of the given property, in 
the hope that the simplified model is easier to verify. The definition of 
simplified model depends on the type of algorithms used in the verifi- 
cation process. For explicit state traversal algorithms whose complexity 
depends on the number of states, the simplification often aims at reduc- 
ing the size of the state space. The complexity of symbolic state space 
traversal algorithms depends on the size of the symbolic representation, 
and therefore the abstract model must be simplified to provide a more 
compact BDD representation of the transition relation and the sets of 
states. 

The properties under verification must be at least partially preserved
during abstraction. Based on how well they preserve the properties,
abstraction methods are classified into two categories: the
property-preserving transformation and the conservative transformation.
Let $A$ be the original model, $\{\phi\}$ be a set of temporal logic
properties, and $\hat{A}$ be the abstract model. Under a
property-preserving transformation, for all the properties in
$\{\phi\}$, $A \models \phi$ if and only if $\hat{A} \models \phi$.
Simplification based on bi-simulation and simulation equivalence
[Mil71, DHWT91], for instance, are property-preserving transformations:
bi-simulation based reduction preserves the entire propositional
$\mu$-calculus, while simulation equivalence based reduction preserves
the entire LTL.

However, it is often the case that the more properties one wants to 
preserve, the less information of the system one can abstract away. Since 
the major concern in practice is to achieve a drastic simplification of 
the model, we are more interested in conservative transformations, even 
though properties may only be partially preserved. Simplification based 
on simulation relation is a conservative transformation. 


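DEFINITION 3.1 A Büchi automaton $\hat{A}$ simulates $A$ (written
$A \preceq \hat{A}$) if there exists a simulation relation
$R \subseteq S \times \hat{S}$ such that


1 for every initial state $s \in S_0$, there is an initial state
$\hat{s} \in \hat{S}_0$ such that $(s,\hat{s}) \in R$;


2 $(s,\hat{s}) \in R$ implies that $\Lambda(s) = \hat{\Lambda}(\hat{s})$,
and if $(s,t) \in T$ then there exists $\hat{t} \in \hat{S}$ such that
$(\hat{s},\hat{t}) \in \hat{T}$ and $(t,\hat{t}) \in R$.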


If $A \preceq \hat{A}$, then $\mathcal{L}(A) \subseteq \mathcal{L}(\hat{A})$
[Mil71, DHWT91], where $\mathcal{L}(A)$ denotes the language accepted by
$A$. This is because a corresponding run in $\hat{A}$ exists for every
run in $A$; that is, $\hat{A}$ has all possible behavior of $A$, and
maybe more. This is the reason why abstract counterexamples may contain
transitions that are allowed in $\hat{A}$ but not in $A$, which produces
"false negatives" when we are checking properties in $\hat{A}$.
Conservative abstraction can be improved by successive refinements, as
is done in the iterative abstraction refinement framework. In practice,
one often starts with a primitive abstract model that only partially
preserves the property. Information about the concrete model is
gradually added until the false negative result is completely removed.
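
To make the simulation relation concrete, the sketch below computes the greatest simulation between two small explicit-state models by fixpoint refinement and then checks the initial-state condition. It is only an illustration: acceptance conditions are ignored, the dictionary layout (`init`, `trans`, `label`) is an assumed encoding, and the two example models are invented.

```python
def greatest_simulation(conc, abst):
    """Largest relation R such that related states carry the same label and
    every concrete move can be matched by an abstract move staying in R."""
    rel = {(s, a) for s in conc["label"] for a in abst["label"]
           if conc["label"][s] == abst["label"][a]}
    changed = True
    while changed:
        changed = False
        for s, a in list(rel):
            for t in conc["trans"].get(s, ()):
                if not any((t, b) in rel for b in abst["trans"].get(a, ())):
                    rel.discard((s, a))   # the move s -> t cannot be matched from a
                    changed = True
                    break
    return rel


def simulates(conc, abst):
    """True if abst simulates conc: every concrete initial state is related
    to some abstract initial state in the greatest simulation."""
    rel = greatest_simulation(conc, abst)
    return all(any((s, a) in rel for a in abst["init"]) for s in conc["init"])


# Concrete model: a three-state cycle 0 -> 1 -> 2 -> 0, labeled by one bit.
concrete = {"init": {0},
            "trans": {0: {1}, 1: {2}, 2: {0}},
            "label": {0: "lo", 1: "lo", 2: "hi"}}

# Abstract model: states merged by label, with every induced transition.
abstract = {"init": {"L"},
            "trans": {"L": {"L", "H"}, "H": {"L"}},
            "label": {"L": "lo", "H": "hi"}}
```

Here `simulates(concrete, abstract)` holds, while the converse does not: the abstract self-loop on `L` followed by a move to `H` has no concrete counterpart from state 0, which is exactly the extra behavior that can show up as a spurious counterexample.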

The mapping between abstract and concrete models can be described 
using the more general notion of Galois connection [BBLS92]. 


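DEFINITION 3.2 A Galois connection from $S$ to $\hat{S}$ is a pair of
functions $\alpha: 2^{S} \to 2^{\hat{S}}$ and
$\gamma: 2^{\hat{S}} \to 2^{S}$ that are called the abstraction function
and the concretization function, respectively. $\alpha$ and $\gamma$ are
both complete and monotonic, and must satisfy the following conditions:


a $\forall X \in 2^{S}$, $\gamma \circ \alpha(X) \supseteq X$, and
b $\forall \hat{X} \in 2^{\hat{S}}$, $\alpha \circ \gamma(\hat{X}) \supseteq \hat{X}$.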


Under the following condition, the Galois connection can be reduced to
the simulation relation: if $\forall \hat{X} \in 2^{\hat{S}}$,
$\alpha(\mathrm{EX}_{T}(\gamma(\hat{X}))) \subseteq \mathrm{EX}_{\hat{T}}(\hat{X})$,
then $A \preceq \hat{A}$. Although the Galois connection provides a more
general framework, no easy and practical implementation has been
proposed so far to exploit the flexibility provided by the extra
generality.

We shall show in the sequel that when simulation relation is used, 
little overhead is required to construct abstract models from the concrete 
model. The idea is to directly abstract the transition relation. For 
sequential circuits, their abstract models can be constructed directly 
from a high level description of the system, even before the concrete 
model of the system is available. This is in contrast to various predicate 
abstraction methods [GS97, DDP99, BMMR01], which often need to
spend a substantial amount of time in constructing the abstract model 
(also called abstraction computation). Abstraction used in this book 
relies on the simulation relation. 


3.2 Fine-Grain Abstraction 


Let the concrete model be represented as a generalized Büchi
automaton $A = \langle T, I \rangle$, where $T(x,w,y)$ is the
characteristic function of the transition relation and $I(x)$ is the
characteristic function of the set of initial states. The model is
considered as the synchronous (parallel) composition of a set of
submodules. In the simplest form, every binary state variable together
with its bit transition relation is considered as a submodule. Let
$J = \{1,\ldots,m\}$ be a permutation of the indexes of the state
variables; then,


$$T(x,w,y) = \bigwedge_{j \in J} T_j(x,w,y_j),$$


where $T_j(x,w,y_j)$ is the bit transition relation of the j-th binary
state variable. Let $\Delta_j(x,w)$ be the next-state function of the
j-th state variable in terms of present-state variables and inputs; then
$T_j = (y_j \leftrightarrow \Delta_j(x,w))$. Note that $T_j$ depends on
one next-state variable $y_j$ but on potentially all present-state
variables in $x$.

Over-approximations of the two Boolean functions $T$ and $I$, denoted
by $\hat{T}$ and $\hat{I}$, respectively, induce an abstract model
$\hat{A}$ that simulates $A$. A straightforward way of building
$\hat{T}$ from $T$, which has been adopted by many existing algorithms,
is to replace some $T_j$ by tautologies. Note that this approach treats
the bit transition relation $T_j$ as an atom for abstraction: it is
either included in or excluded from the abstract model completely.
Since each $T_j$ corresponds to a binary state variable (or latch), the
abstraction granularity is at the state variable level. Assume that the
abstract model $\hat{A}$ contains a subset of state variables
$\hat{J} = \{1,\ldots,k\} \subseteq J$, a subset $\hat{x} \subseteq x$ of
present-state variables, and a subset $\hat{y} \subseteq y$ of
next-state variables; then $\hat{T}$ is defined as follows:


$$\hat{T}(x,w,\hat{y}) = \bigwedge_{j \in \hat{J}} T_j(x,w,y_j),$$


where variables in $\hat{x}$ are called the visible state variables, and
variables in $\tilde{x} = x \setminus \hat{x}$ are called the invisible
state variables. Bit transition relations corresponding to the variables
in $\tilde{x}$ are abstracted away. $\hat{T}$ is an existential
abstraction of $T$ in the sense that an abstract transition exists if
and only if it contains a concrete transition. The set of initial states
$\hat{I}(\hat{x})$ is an existential projection of $I(x)$: an abstract
state is initial if and only if it contains a concrete initial state.

There are two different views of the abstract model
$\hat{A} = \langle \hat{T}, \hat{I} \rangle$:


1 The abstract model is defined in exactly the same concrete state 
space, only with more transitions among the states and possibly more 
states labeled as initial. While the number of states of the model 
remains the same, the simplification is mainly in the size of the BDD 
representation of the transition relation. This interpretation appears 
natural when analyzing symbolic graph algorithms. 


2 The abstract model is defined in a reduced state space: a set of
concrete states in which any two states cannot be distinguished from each
other under the simplified transition relation $\hat{T}$ forms an
equivalence class; mapping every equivalence class into a new state
forms the abstract model in a reduced state space. The number of states
in the reduced state space is also called the number of effective
states. The worst-case complexity of the graph algorithms is determined
by the number of effective states. It makes sense, especially for
analyzing explicit graph algorithms, to consider the abstract model in a
reduced state space. (In the analysis of some symbolic graph algorithms,
the number of effective states can also be useful [BGS00].)
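
The second view can be illustrated with a few lines of explicit-state code. This is a sketch under stated assumptions: the model is closed and deterministic (one next-state function per bit, no inputs), the relation is enumerated over all concrete states rather than represented by BDDs, and the function names are hypothetical. Projecting every concrete transition onto the visible bits gives the existential abstraction in the reduced state space.

```python
from itertools import product


def full_transition_relation(next_fns):
    """Concrete transition relation over ALL states (reachable or not) of a
    closed deterministic model with one next-state function per bit."""
    n = len(next_fns)
    return {(s, tuple(f(s) for f in next_fns))
            for s in product((0, 1), repeat=n)}


def existential_abstraction(init, trans, visible):
    """An abstract transition (or initial state) exists if and only if it
    contains at least one concrete transition (or initial state)."""
    def project(state):
        return tuple(state[j] for j in visible)

    abs_init = {project(s) for s in init}
    abs_trans = {(project(s), project(t)) for s, t in trans}
    return abs_init, abs_trans


# Toy model: state = (b0, b1), cycling 00 -> 10 -> 01 -> 00.
counter = [lambda s: int(not s[0] and not s[1]),   # next value of b0
           lambda s: int(s[0] and not s[1])]       # next value of b1

abs_init, abs_trans = existential_abstraction(
    [(0, 0)], full_transition_relation(counter), visible=[1])
```

With only `b1` visible there are two effective states, and the abstract model gains a self-loop on `b1 = 0` that no single concrete state exhibits forever.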


Restricting the abstraction granularity to the state variable level,
however, is not suitable for verifying industrial-scale systems with
extremely large combinational logic cones. The abstract bit transition
relation $\hat{T}_j$ is either $(y_j \leftrightarrow \Delta_j)$, with
$\Delta_j$ the next-state function, or the tautology, depending on
whether the corresponding state variable is included or not, but it
cannot be an arbitrary Boolean function in between. However, there are
often cases where not all the logic gates in the combinational logic
cone are necessary for the verification, even though the state variable
itself is necessary. In these cases, an abstraction $\hat{T}_j$ of the
bit transition relation such that
$(y_j \leftrightarrow \Delta_j) \le \hat{T}_j \le 1$ would be ideal;
this, however, is not possible when we use the "coarse-grain"
abstraction approach. Unnecessarily including the irrelevant information
can make the abstract model too complex for the model checker to deal
with.

We now give a finer grain approach for abstraction to push the ab- 
straction granularity beyond the state variable level. We consider not 
only the state variables but also Boolean network variables. Boolean 
network variables (BNVs) are the intermediate variables selectively in- 
serted into the combinational logic cones of latches to partition large 
logic cones so that a compact BDD representation of their transition 
relations is possible. Each Boolean network variable is associated with 
a small portion of the combinational circuit in its fan-in cone; similar 
to state variables, each BNV together with its associated area has an 
elementary transition relation. The transition relation of the entire sys- 
tem is the conjunction of all these elementary transition relations. The 
following example shows how fine-grain abstraction works. 

Figure 3.1. Illustration of fine-grain abstraction.

We use the example in Figure 3.1 to illustrate the difference between
traditional (coarse-grain) abstraction and the new fine-grain abstraction
approach. In Figure 3.1, there are ten gates in the fan-in combinational
logic cones of the two latches. Variables $y_1$ and $y_2$ are the
next-state variables, and $x_1,\ldots,x_5$ are the present-state
variables. Let $\Delta_{y_1}$ be the output function of Gate 9 in terms
of the present-state variables and inputs; similarly, let $\Delta_{y_2}$
be the output function of Gate 10. $\Delta_{y_1}$ and $\Delta_{y_2}$ are
also called the transition functions of Latch 1 and Latch 2,
respectively. According to the definition given above, the bit
transition relation of Latch 1 is


$$T_1 = y_1 \leftrightarrow \Delta_{y_1}(x_1, x_2, x_3, x_4).$$


Boolean network variables are a selective set of internal nodes in the
fan-in combinational cones of state variables. To illustrate this, we
insert 4 BNVs, $t_1$, $t_2$, $t_3$, and $t_4$, into this circuit. We use
$\delta_{t_i}$ to represent the output function of the signal $t_i$, but
in terms of both present-state variables and BNVs (as opposed to
present-state variables only). Similarly, we define for each BNV $t_i$
the elementary transition relation
$T_{t_i} = t_i \leftrightarrow \delta_{t_i}$. For the example in
Figure 3.1, these new functions and transition relations are defined as
follows:


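$\delta_{t_1} = x_1 \oplus x_2$                                       $T_{t_1} = t_1 \leftrightarrow \delta_{t_1}$
$\delta_{t_2} = \neg(x_2 \wedge x_3)$                                 $T_{t_2} = t_2 \leftrightarrow \delta_{t_2}$
$\delta_{t_3} = \neg(x_3 \vee x_5) \wedge x_4$                        $T_{t_3} = t_3 \leftrightarrow \delta_{t_3}$
$\delta_{t_4} = \neg x_5 \oplus t_3$                                  $T_{t_4} = t_4 \leftrightarrow \delta_{t_4}$
$\delta_{y_1} = \neg(x_1 \wedge t_2) \vee \neg(x_4 \wedge t_1)$       $T_{y_1} = y_1 \leftrightarrow \delta_{y_1}$
$\delta_{y_2} = (x_4 \wedge t_1) \wedge t_4$                          $T_{y_2} = y_2 \leftrightarrow \delta_{y_2}$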


The state variable $y_1$ is now associated with $\delta_{y_1}$ instead
of $\Delta_{y_1}$. The bit transition relation of Latch 1 is a
conjunction of three elementary transition relations

$$T_1 = T_{y_1} \wedge T_{t_1} \wedge T_{t_2}.$$


In coarse-grain abstraction methods where only state variables are
treated as atoms, when Latch 1 is included in the abstract model, all
the six fan-in gates (Gates 1, 2, 4, 5, 7, and 9) are also included;
that is, $\hat{T} = T_{y_1} \wedge T_{t_1} \wedge T_{t_2}$. In the new
fine-grain abstraction approach, BNVs as well as latches are treated as
atoms, which means that when Latch 1 is in the abstraction, only those
gates covered by the elementary transition relation $T_{y_1}$ are
included. This is indicated in the figure by the cut $\phi_1$, which
contains Gates 5, 7, and 9.

In the successive refinements, only the clusters of logic gates that are
relevant to a set of refinement variables are added. In the next chapter,
an algorithm to identify which variables should be included in the
refinement set will be presented. Meanwhile, assume that the current
abstract model is not sufficient, and $t_2$ is added during refinement.
The refined model is indicated by the new cut $\phi_2$ in the figure,
which corresponds to $\hat{T} = T_{y_1} \wedge T_{t_2}$. The abstract
model now contains Latch 1 and Gates 2, 5, 7, and 9. Continuing this
process, we will add $y_2, t_4, \ldots$ until a proof or a refutation is
found.

It is possible with the new fine-grain approach that gates covered by
the transition cluster $T_{t_1}$ (i.e., Gates 1 and 4) never appear in
the abstract model, if they are indeed irrelevant to the verification of
a given property. This demonstrates the advantage of fine-grain
abstraction. In a couple of large circuits extracted from the PicoJava
microprocessor design [PIC], we have observed that over 90% of the gates
in some large fan-in cones are indeed redundant, even though the
corresponding latches are necessary.

The granularity of the new abstraction approach depends on the sizes 
of the elementary transition relations, as well as the algorithm used 
to perform the partition. The partitioning algorithm is important be- 
cause it affects the quality of the BNVs. In our own investigation, the 
frontier [RAB*95] partitioning method is applied to selectively insert 
Boolean network variables. The method was initially proposed in the 
context of symbolic image computation. It works as follows: First, the 
elementary transition function of each gate is computed from the com- 
binational inputs to the combinational outputs, in a topological order. 
If the BDD size of an elementary transition function exceeds a given 
threshold, a Boolean network variable is inserted. For all the fan-out 
gates of that gate, their elementary transition functions are computed 
in terms of the new Boolean network variable. Each Boolean network 
variable or state variable is then associated with a BDD $\delta_{t_k}$ or $\delta_{y_j}$ for
describing its transition function. 
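
The insertion rule can be sketched as follows. This is not the frontier algorithm of [RAB*95] itself: to stay self-contained, the sketch uses the number of support variables of a gate as a stand-in for the BDD size of its function, which is a deliberate simplification, and the netlist encoding and names are hypothetical.

```python
def frontier_partition(gates, threshold):
    """Walk the gates in topological order. The support of each gate is
    expressed over primary inputs, latches, and already inserted BNVs.
    When the support grows beyond `threshold` (a proxy for the BDD size
    limit), the gate output becomes a Boolean network variable and its
    fan-out gates see only that new variable."""
    support = {}    # gate -> variables its function currently depends on
    clusters = {}   # BNV  -> support of its elementary transition function
    for gate, fanins in gates.items():   # dict order is assumed topological
        sup = set()
        for f in fanins:
            sup |= support.get(f, {f})   # inputs and latches support themselves
        if len(sup) > threshold:
            clusters[gate] = sup         # insert a BNV at this gate
            support[gate] = {gate}
        else:
            support[gate] = sup
    return clusters, support


# Hypothetical netlist: gate name -> list of fan-in signals.
netlist = {"g1": ["a", "b"],
           "g2": ["c", "d"],
           "g3": ["g1", "g2"],
           "g4": ["g3", "e"]}

clusters, support = frontier_partition(netlist, threshold=3)
```

In this example only `g3` exceeds the limit, so it becomes a BNV whose cluster depends on `a`, `b`, `c`, `d`, while `g4` is expressed over `g3` and `e`. Lowering the threshold produces more, smaller clusters, which is the knob that controls the abstraction granularity.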

Now we formalize the definition of a fine-grain abstract model. Let
$t = \{t_1,\ldots,t_n\}$ denote the set of Boolean network variables.
Assume that each variable $t_k$ is associated with an elementary
transition relation $T_{t_k} = (t_k \leftrightarrow \delta_{t_k})$, and
each state variable is associated with a fine-grain bit transition
relation $T_{y_j} = (y_j \leftrightarrow \delta_{y_j})$. Let
$J = \{1,\ldots,m\}$ be a permutation of the indexes of state variables
and $K = \{1,\ldots,n\}$ be a permutation of the indexes of BNVs. The
concrete transition relation can be represented as


$$T(x,w,t,y) = \bigwedge_{j \in J} T_{y_j}(x,w,t,y_j) \wedge \bigwedge_{k \in K} T_{t_k}(x,w,t,t_k).$$


Let the fine-grain abstraction consist of $m' \le m$ state variables and
$n' \le n$ Boolean network variables, i.e.,
$\hat{J} = \{1,\ldots,m'\} \subseteq J$ and
$\hat{K} = \{1,\ldots,n'\} \subseteq K$. Then,

$$\hat{T}(\hat{x},\hat{w},\hat{t},\hat{y}) = \bigwedge_{j \in \hat{J}} T_{y_j}(\hat{x},\hat{w},\hat{t},y_j) \wedge \bigwedge_{k \in \hat{K}} T_{t_k}(\hat{x},\hat{w},\hat{t},t_k).$$


Here $\hat{x} = \{x_j \mid j \in \hat{J}\}$ is the subset of
present-state variables in the abstract model,
$\hat{y} = \{y_j \mid j \in \hat{J}\}$ is the subset of next-state
variables, and $\hat{t} = \{t_k \mid k \in \hat{K}\}$ is the subset of
BNVs. All the remaining (invisible) present-state variables and BNVs
($\tilde{x} \cup \tilde{t}$) go into $\hat{w}$. We sometimes call the
variables in $\hat{w}$ pseudo-primary inputs since they are treated as
inputs during symbolic model checking. The initial predicate $\hat{I}$
is an existential projection of $I$: an abstract state is called initial
if it contains a concrete initial state.

We can fine tune the actual abstraction granularity by controlling the 
BDD size threshold during the frontier partitioning. In the extreme case 
such that the BDD size threshold is set to 1 (i.e., a Boolean network vari- 
able is created for every logic gate), the optimum deciding abstraction 
is, among all the possible final abstract models, the one with the fewest 
logic gates. This is interesting because the number of logic gates has 
been used as a main optimization criterion in logic synthesis [HS96]. In 
this sense, we have built a connection between abstraction refinement 
and logic synthesis; abstraction refinement can be considered as an iter- 
ative process of synthesizing the smallest abstract model that can prove 
or refute the given property. 


3.3 Abstract Counterexamples 


Counterexamples found in the abstract model may not be real paths, 
because some transitions that are responsible for the counterexamples 
may be forbidden in the concrete model. To check whether an abstract 
counterexample is real or not, we need to conduct a concretization test. 
Conceptually, a concretization test checks whether one can reconstruct
the abstract execution trace in the concrete model. If the abstract
counterexample trace cannot be reconstructed, it is called spurious.
When a property fails in the abstract model, there are often many
possible counterexamples. Note that the number of counterexamples to 
a general LTL property can be infinite (e.g., when the counterexam- 
ples contain cycles). Even if one focuses on the counterexamples of the 
shortest length, the number of counterexamples can still be extremely 
large. 

We have observed, for instance, that the number of shortest coun- 
terexamples to an invariant property is 10% in a model with 100 binary 
state variables. This suggests that analyzing counterexamples one by
one through enumeration is not a good strategy; at the same time,
arbitrarily choosing one counterexample from many is a
"needle-in-the-haystack" approach, especially if one wants to use the counterexample
to guide the computation of refinement variables. We believe that it is 
often advantageous to capture as many counterexamples as possible and 
to analyze them simultaneously. For safety properties, we can actually 
capture symbolically all the counterexamples of the shortest length. 


3.3.1 Synchronous Onion Rings 


All the shortest counterexamples can be captured by a data structure
called the Synchronous Onion Rings (SORs). The SORs build upon the
reachability onion rings, which have been used frequently in symbolic
model checking. Consider checking an invariant property of the form
G p: forward reachability computation from the set $I$ of initial states
will produce a set of forward reachability onion rings
$\{F^0, \ldots, F^n\}$, among which $F^i$ represents the set of new
states encountered at the i-th step during breadth-first search. For the
state transition graph in Part (1) of Figure 3.2, for instance, the set
of forward reachability onion rings is represented by
$F^0, F^1, \ldots, F^3$. In particular, $F^1 = \{3,4,5\}$ is the set of
states that can be reached in one step from the initial states. States
labeled $\neg p$ are first reached at the 3rd step of the search. An
analogous backward reachability analysis from the $\neg p$ states in the
3rd step would reach States $\{8,9\}$ at the 2nd step, $\{5\}$ at the
1st step, and the initial state 2. That gives the set of backward
reachability onion rings. Once the forward and backward reachability
onion rings are available, the intersection of the two sets of states at
each corresponding step gives the synchronous onion rings,


$$S^0 = \{2\}, \quad S^1 = \{5\}, \quad S^2 = \{8,9\}, \quad S^3 = \{12\}.$$


The term “Ariadne’s bundle" is used to denote the subrelation T? 
induced by considering only the transitions between a state at one step 
to another state in the next step in the SORs. It comprises the bundle 
of all shortest counterexamples, and no other counterexample. 
[Figure panel label: (2) Synchronous onion rings]

Figure 3.2. Ariadne's bundle of synchronous onion rings.


There is an interesting analogy between the abstract counterexamples
and the magic Ariadne's thread: in Greek mythology, Theseus needs
the thread to navigate through the Labyrinth to kill the Minotaur;
in abstraction refinement, one needs the guidance of abstract
counterexamples to remove the "false negatives."

Compared to the abstract model, the state transition graph of the 
Ariadne's bundle often has significantly fewer states and transitions. In
this simple example, there are two shortest counterexamples, both of 
length 3. In practice, however, the number of counterexamples in the 
SORs is typically large. 
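
The construction of the SORs can be sketched with explicit sets in place of BDDs. The graph below is hypothetical: it is not the graph of Figure 3.2, only one that is consistent with the ring sets quoted in the text. The backward pass is implemented by keeping, in each forward ring, the states that have a successor in the next synchronous ring, which for this example yields the same sets as intersecting forward and backward rings step by step.

```python
def forward_rings(init, succ, bad):
    """Breadth-first onion rings from the initial states, stopping at the
    first ring that contains a bad (not-p) state. None if unreachable."""
    rings, seen = [set(init)], set(init)
    while not rings[-1] & bad:
        new = {t for s in rings[-1] for t in succ.get(s, ())} - seen
        if not new:
            return None
        seen |= new
        rings.append(new)
    return rings


def synchronous_onion_rings(init, succ, bad):
    """Rings S^0..S^L that contain exactly the states lying on some
    shortest path from an initial state to a bad state."""
    fwd = forward_rings(init, succ, bad)
    if fwd is None:
        return None
    sors = [fwd[-1] & bad]
    for ring in reversed(fwd[:-1]):
        sors.append({s for s in ring if set(succ.get(s, ())) & sors[-1]})
    return sors[::-1]


def count_shortest_counterexamples(sors, succ):
    """Number of paths through the bundle, by dynamic programming."""
    counts = {s: 1 for s in sors[-1]}
    for ring in reversed(sors[:-1]):
        counts = {s: sum(counts.get(t, 0) for t in succ.get(s, ()))
                  for s in ring}
    return sum(counts.values())


# Hypothetical state graph: initial states {1, 2}, state 12 violates p.
succ = {1: [3, 4], 2: [5], 3: [6], 4: [7], 5: [8, 9],
        6: [10], 7: [11], 8: [12], 9: [12]}
rings = synchronous_onion_rings({1, 2}, succ, bad={12})
```

For this graph the rings come out as {2}, {5}, {8, 9}, {12}, and the bundle contains two shortest counterexamples of length 3.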


3.3.2 Multi-Thread Concretization Test 


Once all the abstract counterexamples are captured in the SORs, we 
need to check whether they are real paths in the concrete model. This, 
unfortunately, cannot be accomplished by conventional simulation even 
if a single abstract path is being reconstructed. This is because an 
abstract path may not have a complete set of assignments to all the 
input variables—one abstract transition often corresponds to multiple 
concrete transitions. Therefore, the concretization test requires the use
of symbolic simulation techniques. 

Various symbolic techniques have been proposed for concretization 
test. In [CGJ*00], for instance, BDD based image computation was 
used in the concrete model to reconstruct all the concrete paths inside 
a single abstract path. In [WHL*01], the search of a concrete path 
inside a single abstract path was performed by an ATPG (automatic 
test pattern generation) engine. In [CGKS02, CCK*02], SAT solvers
were used to perform the reconstruction. However, one thing is common 
to all these methods: concretization test deals with only a single abstract 
counterexample. 

We want to simultaneously concretize all the shortest counterexamples
in the SORs. This multi-thread concretization test can be formulated
into a Boolean satisfiability problem, which can be solved by a SAT
solver. Given a set of rings in the SORs $\{S^0, \ldots, S^L\}$, the SAT
problem can be defined as $\Psi = \Psi_A \wedge \Psi_S$, where


$$\Psi_A = I(V^0) \wedge \bigwedge_{0 \le i < L} T(V^i, U^i, V^{i+1}),$$

$$\Psi_S = \bigwedge_{0 \le i \le L} S^i(V^i).$$


Formula $\Psi_A$ represents the unrolling of the concrete transition
relation for exactly $L$ time frames, and $\Psi_S$ represents the
constraints coming from the abstract SORs. $V^i$ and $U^i$ are the set
of state variables and the set of combinational variables at the i-th
time frame, respectively. The predicate $S^i(V^i)$ represents that
states at the i-th time frame are in the i-th ring of the SORs. We give
a graphic illustration of the multi-thread concretization test in
Figure 3.3.

Formula $\Psi$ is satisfiable if and only if a concrete counterexample
exists inside the abstract SORs. When $\Psi$ is satisfiable, the set of
assignments returned by the SAT solver represents a concrete path from
an initial state to a $\neg p$ state.
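
The meaning of the query can be illustrated without a SAT solver. The sketch below is an explicit-state stand-in, not the SAT formulation itself: it searches for a concrete path of the same length as the SORs whose i-th state maps into the i-th ring, and returns None exactly when every counterexample in the bundle is spurious. The abstraction map `h`, the successor function, and the toy model are assumptions made for the example.

```python
def concretize_sors(init, succ, h, sors):
    """Look for concrete states c_0 .. c_L with c_0 initial, each
    (c_i, c_i+1) a concrete transition, and h(c_i) in the i-th ring.
    Returns one such path, or None if none exists."""
    layer = {c: None for c in init if h(c) in sors[0]}   # state -> parent
    if not layer:
        return None
    layers = [layer]
    for ring in sors[1:]:
        nxt = {}
        for c in layers[-1]:
            for d in succ(c):
                if h(d) in ring and d not in nxt:
                    nxt[d] = c
        if not nxt:
            return None          # the whole bundle is spurious
        layers.append(nxt)
    # Rebuild one witness path by following parent pointers backwards.
    c = next(iter(layers[-1]))
    path = [c]
    for i in range(len(layers) - 1, 0, -1):
        c = layers[i][c]
        path.append(c)
    return path[::-1]


def step(c):
    """Toy counter on (b0, b1): 00 -> 10 -> 01 -> 00."""
    return (int(not c[0] and not c[1]), int(c[0] and not c[1]))


def succ(c):
    """One free input: either hold the current state or take a step."""
    return [c, step(c)]


def h(c):
    """Abstraction map: only bit b1 is visible."""
    return c[1]
```

With rings over the visible bit, `concretize_sors([(0, 0)], succ, h, [{0}, {1}])` finds nothing, so the length-1 abstract counterexamples are all spurious, whereas the rings `[{0}, {0}, {1}]` admit the concrete path (0,0), (1,0), (0,1).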

Figure 3.3. Multi-thread concretization test.

In symbolic model checking, $S^i$ is initially a BDD representing the
set of states in the i-th ring of the SORs. To translate it into the CNF
format, we need an encoding procedure that takes a BDD as input and
produces a conjunction of clauses. The translation, as illustrated in
Figure 3.4, goes as follows: First, we translate the BDD into a
combinational logic circuit by replacing every internal BDD node with a
2-to-1 multiplexer. Since each multiplexer consists of three AND gates,
this translation is linear in the BDD size. Once the and-inverter graph
is built, encoding it into a CNF formula is straightforward. As we have
explained in Section 2.5, the transition relation of each AND gate can
be encoded as three clauses, if we are allowed to add auxiliary
variables.
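
As a small illustration of this encoding, the sketch below writes one multiplexer as three AND gates (with inverters expressed as negated literals) and each AND gate as three clauses. Literals follow the common signed-integer convention (a negative number is a negated variable); the function names and variable numbering are hypothetical.

```python
from itertools import product


def and_gate_clauses(z, a, b):
    """Clauses for z <-> (a AND b). Literals are nonzero integers and a
    negative literal denotes the complement of the variable."""
    return [[-z, a], [-z, b], [z, -a, -b]]


def mux_clauses(out, sel, hi, lo, fresh):
    """Clauses for out <-> (hi if sel else lo), using three AND gates:
    p = sel AND hi, q = (NOT sel) AND lo, (NOT out) = (NOT p) AND (NOT q).
    `fresh` supplies the two auxiliary variables p and q."""
    p, q = fresh
    return (and_gate_clauses(p, sel, hi)
            + and_gate_clauses(q, -sel, lo)
            + and_gate_clauses(-out, -p, -q))


def satisfies(clauses, assignment):
    """True if the assignment (variable -> bool) satisfies every clause."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def mux_encoding_is_correct():
    """Exhaustive check: an assignment to out, sel, hi, lo extends to a model
    of the clauses exactly when out equals the multiplexer output."""
    cnf = mux_clauses(1, 2, 3, 4, fresh=(5, 6))
    for out, sel, hi, lo in product([False, True], repeat=4):
        extendable = any(
            satisfies(cnf, {1: out, 2: sel, 3: hi, 4: lo, 5: p, 6: q})
            for p, q in product([False, True], repeat=2))
        if extendable != (out == (hi if sel else lo)):
            return False
    return True
```

One BDD node therefore costs nine clauses and two auxiliary variables in this sketch, which keeps the overall encoding linear in the number of BDD nodes.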

However, this linear encoding scheme requires the addition of a large
number of auxiliary variables, one for each logic gate. An alternative
approach is to enumerate all the minterms (or cubes) of the
complemented function $\neg S^i$, and convert the minterms (or cubes)
into CNF clauses. A minterm (or cube) of a Boolean function corresponds
to a root-to-leaf path in its BDD representation. This encoding scheme
has been used in [CNQ03]. Although no auxiliary variable is required in
this approach, the number of root-to-leaf paths in a BDD can be
exponential with respect to the number of BDD nodes.
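
The path-enumeration alternative can be sketched on a toy decision diagram. The representation is an assumption made for the example: a node is a tuple `(var, low, high)` and the terminals are the Python constants `False` and `True`; no BDD package is involved and no reduction or sharing is modeled.

```python
def cubes_to_false(node, path=()):
    """Yield every root-to-0-terminal path as a tuple of (variable, value)
    pairs. Each such path is a cube of the complemented function."""
    if node is True:
        return
    if node is False:
        yield path
        return
    var, low, high = node
    yield from cubes_to_false(low, path + ((var, False),))
    yield from cubes_to_false(high, path + ((var, True),))


def bdd_to_cnf(node):
    """One clause per cube of the complement: the clause forbids that cube.
    Literals are signed integers; no auxiliary variable is introduced."""
    return [[-var if value else var for var, value in cube]
            for cube in cubes_to_false(node)]


# S = x1 XOR x2 as a decision diagram over variables 1 and 2.
xor_bdd = (1, (2, False, True), (2, True, False))
xor_cnf = bdd_to_cnf(xor_bdd)
```

For the XOR example the two paths to the 0-terminal give the clauses `[1, 2]` and `[-1, -2]`. The clause count equals the number of such paths, which is where the potential exponential blow-up comes from.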


3.4 Further Discussion 


Since the introduction of the general abstraction refinement
framework by Kurshan [Kur94], significant progress has been made on the
refinement algorithms and the concretization test, through the
incorporation of the latest developments in BDD and SAT algorithms
[LPJ*96, JMH00, CGJ*00, CGKS02, CCK*02, GGYA03]. However, in most of
these previous works, the abstraction granularity remains at the state 
variable level. The fine-grain abstraction approach described in this 
chapter is unique in the sense that it treats both state variables and 
Boolean network variables as abstraction atoms. With fine-grain ab- 
straction, the refinement strategies must search in a two-dimensional 
space. Refinement in the sequential direction is comprised of the
addition of new state variables only, which is typical of much of the
pioneering prior art of Clarke et al. [CGJ*00, CGKS02]. Refinement in
the Boolean direction is comprised of the addition of Boolean network
variables only. Boolean network variables belong to a special type of
cut-set variables in the circuit network.

Figure 3.4. Translating a BDD into a combinational circuit.

In [WHL*01], Wang et al. proposed a min-cut model in abstraction
refinement to replace the conventional coarse-grain abstract model. They
first defined a free-cut set of signals as those at the boundary of the
transitive fan-in and transitive fan-out of the visible state variables. They
then computed a min-cut set of signals between the combinational in- 
puts and the free-cut signals; the logic gates above this min-cut were 
included in the reduced abstract model. Since the transition relation is 
expressed in terms of a smaller set of signals, it often has a smaller BDD 
representation. However, the abstraction granularity of this method is 
still at the state variable level. In particular, logic gates above the free- 
cut are always in the abstract model, regardless of whether or not they 
are necessary for verification. In [CCK*02], Chauhan et al. adopted a 
similar approach to simplify the coarse-grain model. In their method, 
the further reduction of abstract model was achieved by pre-quantifying 
pseudo-primary inputs dynamically inside each image computation step. 
This method shares the same drawback as that of [WHL*01].
In [GKMH*03], Glusman et al. computed a min-cut set of signals
between the boundary of the current abstract model and the primary 
inputs, and included logic gates above this cut in the abstract model. 
Since an arbitrary subset of combinational logic gates in the fan-in cone 
of a state variable could be added, the abstraction granularity was at 
the gate level. However, there are significant differences between their 
method and the fine-grain abstraction. First, fine-grain abstraction aims 
at directly reducing the size of the transition relation by avoiding the 
addition of irrelevant elementary transition relations, while the method 
in [GKMH*03] aims at reducing the number of cut-set variables. Sec- 
ond, with fine-grain abstraction, we can differentiate the two refinement 
directions and can control the direction at each refinement iteration step, 
while the method in [GKMH*03] does not differentiate the two direc- 
tions. Third, in their method, logic gates added during refinement can- 
not be removed from the abstract model afterward — even if later they 
are proved to be redundant. Under fine-grain abstraction, the removal 
of previously added variables and intermediate logic gates is possible. 

In [ZPHS05], Zhang et al. proposed a technique called “dynamic ab- 
straction," which maintains at different time steps separate visible vari- 
able subsets. Their approach can be viewed as a finer control of abstrac- 
tion granularity in the time axis, since they are using different abstract 
models at different time frames. However, at each time frame, their 
abstraction atoms are still latches. Therefore, this approach is entirely 
orthogonal to our fine-grain abstraction. 

The use of BDD constraints to help solve the series of SAT problems
in bounded model checking was studied by Gupta et al. in [GGW+03a].
In their work, the BDD constraints were derived from the forward and
backward reachability onion rings. They used such constraints primarily
to improve the BMC induction proof by restricting induction to the
over-approximate reachable state subspace (instead of the entire universe).
In our multi-thread concretization test, we have used the same method
as that of [GGW+03a] for translating BDDs into CNF clauses. An
alternative encoding scheme from BDDs to CNF clauses was proposed
by Cabodi et al. in [CNQ03]; their idea was to enumerate all the minterms
(or cubes) of the complemented BDD and convert each into a CNF
clause. The same authors also proposed a hybrid encoding algorithm
that combines the aforementioned two encoding schemes and tries to 
make a trade-off between them. 
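The minterm-based encoding mentioned above can be illustrated with a minimal sketch. It is not the implementation of [CNQ03]: the constraint is given as a plain Python predicate standing in for a BDD, and the complement is enumerated by brute force over the truth table, whereas a real encoder would walk the paths (cubes) of the complemented BDD. The names `cnf_from_complement_minterms`, `satisfies`, and `at_most_one` are hypothetical.

```python
from itertools import product


def cnf_from_complement_minterms(func, num_vars):
    """Return CNF clauses (lists of signed, 1-based literals) equivalent to func.

    Every minterm of the complement of func is an assignment that falsifies
    func; negating all literals of that minterm gives a clause that blocks
    exactly this assignment.
    """
    clauses = []
    for bits in product([False, True], repeat=num_vars):
        if not func(*bits):
            # A variable that is True in the minterm appears negated in the clause.
            clauses.append([-(i + 1) if bit else (i + 1)
                            for i, bit in enumerate(bits)])
    return clauses


def satisfies(clauses, bits):
    """Check whether a total assignment satisfies every clause."""
    return all(any((lit > 0) == bits[abs(lit) - 1] for lit in clause)
               for clause in clauses)


# Toy constraint standing in for a BDD: at most one of three bits is set.
def at_most_one(a, b, c):
    return a + b + c <= 1


clauses = cnf_from_complement_minterms(at_most_one, 3)
```

Enumerating cubes instead of minterms would yield fewer and shorter clauses, which is presumably why the text mentions cubes as the alternative.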


Chapter 4 


REFINEMENT 


If the abstract counterexamples are all spurious, we need to refine the
current abstraction by identifying a subset of currently invisible variables
and restoring their bit transition relations. To improve abstraction
efficiency, it is crucial to identify those variables that have a larger impact
on removing false negatives. In most of the previous counterexample
driven refinement methods [CGJ+00, WHL+01, CGKS02, CCK+02],
refinement variable selection was guided by a single spurious
counterexample. This is a "needle-in-the-haystack" approach, since in practice the
number of counterexamples tends to be extremely large.


In this chapter, we propose a refinement algorithm driven by all the
shortest counterexamples in the synchronous onion rings (SORs). The new
algorithm, called GRAB [WLJ+03], does not try to remove the entire bundle
of spurious counterexamples in one shot. Instead, it identifies critical
variables that are in the local support of the current abstraction by
viewing refinement as a two-player reachability game in the abstract model.
Due to the global guidance provided by the SORs and the quality and
scalability of the game-based variable selection computation, GRAB has
demonstrated significant performance advantages over previous
refinement algorithms: it often produces a smaller final abstract model that
can prove or refute the same property.


At the end of each refinement generation, i.e., a set of refinement 
steps that are responsible for the removal of spurious counterexamples 
of a certain length, we also apply a SAT based greedy minimization to 
the refinement set in order to remove redundant variables; the result is 
a minimal refinement set that is sufficient to remove the entire bundle 
of abstract counterexamples.


4.1 Generational Refinement


In this section, we will illustrate the generic process of abstraction
refinement by using the simple example in Figure 4.1. Here we consider
the concrete model $A$ as the synchronous (parallel) composition of three
submodules: $M_1$, $M_2$, and $M_3$. That is,


$$A = M_1 \parallel M_2 \parallel M_3 .$$


Each component $M_i$ has one state variable $v_i$. The state variable $v_1$ can
take four values and thus must be implemented by two binary variables;
the other two state variables ($v_2$, $v_3$) are binary variables. Variable $v_4$,
which appears in the edge labels in $M_1$, is a primary input. We also
assume that the property of interest is $\mathsf{G}(v_1 \neq 3)$; that is, State 3 in $M_1$
is not reachable in the concrete model $A$.

The right-hand side of Figure 4.1 is the state transition graph of the 
concrete model. It shows that the given property fails in the concrete 
model, as shown by the bold edges; the concrete counterexample is of
length 4: (000, 111, 200, 211, 300). 


Figure 4.1. An example for abstraction refinement.


The initial abstract model is $\widehat{A} = M_1$, which preserves only the state
variable appearing in the given property; all the other state variables
are treated as pseudo-primary inputs. There is an abstract
counterexample of length 3: (0.., 1.., 2.., 3..). This counterexample is spurious
because it is not concretizable in $A$; in particular, no direct transition is
possible from 200 to 3...

Although this example is simple, it demonstrates an important aspect 
of the abstraction refinement process. Refinement algorithms like those
in [CGJ+00, CGKS02], since they are driven by a single
counterexample, may pick only variable $v_2$ for refinement. However, after this
refinement, an abstract counterexample of the same length still exists;
for instance, it can be (00., 10., 21., 30.). Therefore, the refinement
set $\{v_2\}$ is not a sufficient set since it does not remove the abstract
counterexample (0.., 1.., 2.., 3..), although it can separate the
set of deadend states {200} from the set of bad states {211, 210}. (In
[CGJ+00], deadend states are the concrete states inside an abstract state
$\hat{s}_i$ that are reachable from the initial states but cannot reach any
concrete states in $\hat{s}_{i+1}$; bad states are the concrete states inside $\hat{s}_i$ that can
reach some concrete states in $\hat{s}_{i+1}$.) This is suggestive of the danger of
placing too much refinement emphasis on a single abstract counterex- 
ample. Of course, it is much more of a problem when the SORs contain 
an extremely large number of counterexamples. In the presence of many 
counterexamples, computing the set of refinement variables based on one 
arbitrarily chosen counterexample can be ineffective. 

We now illustrate the SOR based generational refinement framework 
using the above example. In building the SORs, we have pruned away 
self loops and back edges to focus on the shortest counterexamples in 
the current abstract model. The initial SORs are shown in Part (a) 
of Figure 4.2. When $\widehat{A} = M_1$, the SORs contain just the four states
of $M_1$, which are connected by the four forward edges. Since the first
generation of shortest counterexamples is of length 3, the refinement
process starts by dealing with the SORs of length 3 until all paths in
them are killed. Note that in $M_1$ only edges from State 2 to States 1, 2,
and 3 are labeled. We will discuss in Section 4.2 that these edge labels
cause the variable selection routine to pick $v_2$ for the first refinement
iteration, and as a result, the refined model is $\widehat{A} = M_1 \parallel M_2$. However,
$\widehat{A}$ is not constructed in the naive way of building the transition relations
for $M_1$ and $M_2$ first and then conjoining them, but by a more
efficient two-step process.

First, we split the states of $M_1$ according to the labels on their
outgoing edges, as shown in Part (b) of Figure 4.2. Because of the label
$v_2 \wedge v_4$, the last abstract edge (2.., 3..) is split into two rather than four
refined edges. State 20. is now backward unreachable from $\neg p$, so the
two incoming edges, (10., 20.) and (11., 20.), are removed. The outgoing
edges from the state 01. are removed because the state 01. is not an
initial state. States like 20. are called the deadend states. The concept of
deadend states is critically involved in the refinement variable selection
algorithm, as discussed below in Section 4.2. Note that all splits that
make the SORs change from the one in Part (a) to the one in Part (b)
are done before $M_2$ is brought in. The second step is to actually take the
composition of $M_2$ with the remaining edges of the SORs. This removes
the edges (11., 21.) and (21., 31.), and leads to the reduced SORs in
Part (c).


Figure 4.2. The generational refinement process.

After the above refinement step, the number of length-3 spurious coun- 
terexamples is decreased, but they have not been removed completely. 
Now $v_3$ is selected as the next refinement variable. We then proceed to
again take the first part of the two-step refinement process, as illustrated
in Part (d). The result is a disconnection of $I$ from $\neg p$, because there is
no outgoing edge from the sole remaining initial state. At this point, it
has been proved that no concrete counterexample of length 3 exists, so 
this generation of refinements is complete. 

During the two refinements in the first generation, i.e., adding $v_2$ and
adding $v_3$, the SORs are updated incrementally since each ring of the
SORs is a subset of the corresponding ring of the previous SORs. The 
BDD don’t cares associated with this incremental process lend critical 
efficiency to the SORs refinement process. 

Next, we build from scratch the new SORs, which are of length 4 
as shown in Part (e). This final set of SORs contains a single coun- 
terexample that is concretizable in A, as discussed above in reference to 
Figure 4.1. Therefore, the given property fails.


Figure 4.3. The effect of generational refinement, with refinement minimization.


To summarize, the proposed generational refinement algorithm does
not try to remove all the spurious abstract counterexamples in one shot.
Instead, it identifies local variables that are critical to refinement by
exploiting the global guidance provided by the synchronous onion rings.
It may take a set of refinement steps, called a generation of refinements,
to remove all the abstract counterexamples of a given length.


The effect of generational refinement is illustrated in Figure 4.3. The 
data are obtained from a real-world example in which the given invari- 
ant property holds in the concrete model. The upper curve represents 
the number of state variables in the abstract model at the different re- 
finement steps, and the lower curve represents the length of the abstract 
counterexample (ACE). Each flat segment of the lower curve corresponds 
to a generation of refinement steps. A generation consists of a number 
of consecutive refinement steps, all with SORs of the same length. Note 
that within the same generation, the size of the abstract model keeps 
increasing; every time the length of the shortest abstract counterexam- 
ple changes (between generations), the number of abstract variables may 
decrease, due to the greedy refinement minimization procedure that tries 
to keep the abstraction as small as possible by removing redundant vari- 
ables. Our experience shows that this greedy minimization is critical in 
achieving a high abstraction efficiency. 


4.1.1 The GRAB Algorithm 


We now give the pseudo code of our abstraction refinement algorithm 
in Figure 4.4. The algorithm is called GRAB for Generational Refinement 
of Ariadne's Bundle. GRAB accepts as inputs a concrete model $A$ and
a property $\Phi$ of the form $\mathsf{G}\,p$.


The initial abstract model $\widehat{A}$ contains only those state variables that
appear in the local support of the property. In Figure 4.1, for example,
the initial abstract model contains only variable $v_1$, because the property
is $\mathsf{G}(v_1 \neq 3)$. Let $(S^0, S^1, \ldots, S^L)$ be the length-$L$ synchronous onion
rings, where $S^0$ is a subset of initial states, $S^L$ is a subset of states
satisfying $\neg p$, and $S^j$ is a set of states on the shortest abstract paths
from $S^0$ to $S^L$.


The outer loop is over the length $L$ of the current generation of SORs.
With the abstract model being gradually refined, $L$ is guaranteed to grow
monotonically in the outer loop. The action starts in Line 3, where BDD
based forward reachability analysis is used to compute the forward onion
rings from the initial states to $\neg p$ states. If $\neg p$ cannot be reached in $\widehat{A}$,
it cannot be reached in $A$ either. In this case of early termination, GRAB
returns true. Otherwise, the first set of SORs is computed.


GRAB(A, Φ)
{
    Â = INITIALABSTRACTION(A, Φ);
    while (true) {                            // Loop over SORs with different length
        {S^j} = COMPUTESORS(Â, Φ);            // Line 3
        if ({S^j} is empty)
            return true;
        CCE = MULTITHREADCONCRETIZATION(A, {S^j});
        if (CCE not empty)
            return (false, CCE);
        {S_R^j} = {S^j};
        while (true) {                        // Loop at the current length
            Â = REFINEABSTRACTION(Â, {S_R^j});
            {S_R^j} = REDUCESORS(Â, {S_R^j});     // Line 12
            if ({S_R^j} is empty)
                break;
        }
        Â = REFINEMENTMINIMIZATION(Â, {S^j});
    }
}

REFINEABSTRACTION(Â, {S^j})
{
    ω_S = { }, ω_E = î;
    while (|ω_S| < threshold) {
        v = PICKBESTVAR(Â, {S^j});            // Line 19
        ω_S = ω_S ∪ {v}, ω_E = ω_E \ {v};
    }
    return COMPUTEABSTRACTION(Â, ω_S);
}


Figure 4.4. The GRAB abstraction refinement algorithm.




A SAT based concretization test is then performed in the concrete 
model. Here, it simultaneously tries to concretize all the abstract coun- 
terexamples in the SORs by one satisfiability check. If any of these 
counterexamples can be concretized, the property $\Phi$ is proved to be
false and the concrete counterexample (CCE) is returned. If no concrete 
counterexample exists, we start the inner loop over the refinements in 
this generation. 

In the inner loop, although the number of abstract counterexamples 
in the SORs decreases monotonically, the length of the SORs does not 
change. Since all the abstract counterexamples have been proved spu- 
rious at the very beginning, no concretization test is needed inside this 
loop. We use $\{S_R^j\}$ to represent the set of "reduced SORs." Each time
the abstract model is refined, the synchronous onion rings are reduced
(Line 12) until all the spurious counterexamples disappear. Typically
a few passes through the inner loop produce the break-out, which
implies that the set of refinement variables added in the current generation
constitutes a sufficient set, that is, a set of refinement variables that,
when added, removes the entire length-$L$ SORs.

Termination of the GRAB procedure is guaranteed by the finiteness 
of the model. The game based heuristic for picking refinement variables 
will be presented in the next section, followed by a SAT based greedy 
minimization of the refinement set. 
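The control flow of the GRAB procedure described above can be sketched in executable form. This is only an illustration of the two nested loops, not the authors' implementation: the model-specific steps are passed in as a dictionary of callables (`ops`), all of whose names are hypothetical, and the toy steps built by `make_toy_ops` merely pretend that the property becomes provable once a given set of variables is visible. Unlike the pseudo code, the sketch also returns the final abstract model so that the effect of minimization can be inspected.

```python
def grab(concrete, prop, ops):
    """Run generational refinement; `ops` maps step names to callables."""
    abstract = ops["initial_abstraction"](concrete, prop)
    while True:                                  # loop over SOR lengths
        sors = ops["compute_sors"](abstract, prop)
        if not sors:                             # no abstract counterexample
            return True, None, abstract
        cce = ops["concretize"](concrete, sors)  # one check for all threads
        if cce is not None:
            return False, cce, abstract
        reduced = sors
        while reduced:                           # one generation of refinements
            abstract = ops["refine"](abstract, reduced)
            reduced = ops["reduce_sors"](abstract, reduced)
        abstract = ops["minimize"](abstract, sors)


def make_toy_ops(all_vars, needed):
    """Stub steps: the property becomes provable once `needed` is visible."""
    def proved(abstract):
        return needed <= abstract
    return {
        "initial_abstraction": lambda concrete, prop: frozenset(),
        "compute_sors": lambda abstract, prop: [] if proved(abstract) else ["ring"],
        "concretize": lambda concrete, sors: None,
        "refine": lambda abstract, sors: abstract | {min(all_vars - abstract)},
        "reduce_sors": lambda abstract, sors: [] if proved(abstract) else sors,
        "minimize": lambda abstract, sors: abstract & needed,
    }


result = grab(None, "G p", make_toy_ops(frozenset("abcd"), frozenset("c")))
```

In the toy run, the single generation adds the variables a, b, and c in turn, the minimization step drops a and b again, and the next pass of the outer loop finds no abstract counterexample.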

Prior art in abstraction refinement algorithms [CGJ+00, CGKS02,
CCK+02] can also be described with a similar framework of pseudo
code; however, these algorithms are all based on the analysis of a single
counterexample. As we have pointed out earlier, even an optimal
refinement algorithm based on a single counterexample cannot necessarily
guarantee a good overall refinement. We will compare GRAB with these 
alternative methods later in the experimental results section. 


4.2 Refinement Variable Selection 


We consider the refinement variable selection problem as a two-player
reachability game in the abstract model. Given the abstract model $\widehat{A}$
and a target predicate $\neg p$, the model checking of $\mathsf{G}\,p$ with respect to
the model $\widehat{A}$ can be viewed as a two-player concurrent reachability game
[EJ91, JRS02]. The two players of this game are the hostile environment
and the abstract system; they play by controlling the values of the
pseudo-primary inputs of the abstract model. In $\widehat{A}$, the pseudo-primary
inputs, denoted by $\hat{\imath}$, are the set of invisible variables. We need to
partition the set $\hat{\imath}$ into two subsets, $\hat{\imath} = \omega_E \cup \omega_S$, among which $\omega_E$
is controlled by the environment (player), and $\omega_S$ is controlled by the
system (player).

The positions of the game correspond to the states of the abstract
model. Let $X$, a valuation of the set of present-state variables $\hat{x}$, be
a position (similarly, let capital values of other vector names stand for
their valuations). From one position $X$, the environment chooses values
for the variables in $\omega_E$ and simultaneously the system chooses values for
variables in $\omega_S$. The new position is the unique $Y$ determined by the
abstract transition relation $\widehat{T}(\hat{x}, \hat{\imath}, \hat{y})$ once $X$ and the input valuation
are fixed. Note that we are assuming that
the model $\langle T, I \rangle$ is a closed system. This is not a problem at all since
any open system can be transformed into a closed system by treating
inputs as free state variables. The goal of the environment is to force the
abstract model to go through spurious paths and reach a state labeled
$\neg p$ in spite of the opposition of the system. A (memoryless) strategy for
the environment is a function that maps each state of $\widehat{A}$ to one valuation
of the variables in $\omega_E$. Likewise, a strategy for the system is a function
that maps each state of $\widehat{A}$ to one valuation of the variables in $\omega_S$.

To relate this reachability game to our refinement problem, we con- 
sider the environment the hostile player (and we want the system to 
win). Next, we define the winning positions for the hostile environment. 


DEFINITION 4.1 A position $X$ in $\widehat{A}$ is a winning position for the hostile
environment if there exists an environment strategy such that, for all
system strategies, $\neg p$ is eventually satisfied.


The concept of winning position is closely related to the refinement
problem. Before the abstract model is refined, there are spurious paths from
the initial states to states labeled $\neg p$. This corresponds to the partition
$(\omega_E = \hat{\imath}, \omega_S = \{\})$: the hostile environment controls all the invisible
variables. Assuming that the model is deterministic, the environment always
has a winning strategy because it can force any transition by controlling
all the variables in $\hat{\imath}$. In refinement, our task is to remove some variables
from $\omega_E$ and to put them into $\omega_S$. We want to identify a small subset
of variables that, once removed from $\omega_E$, will significantly reduce
the number of winning positions for the hostile environment.

Therefore, the refinement problem can be stated as follows: among
all the possible partitions $\hat{\imath} = \omega_E \cup \omega_S$, choose the one that gives the
environment the smallest number of winning positions. Once the partition
is identified, variables in $\omega_S$ together with their elementary transition
relations are added into the abstract model. In this two-player
reachability game, the partition that favors the hostile environment the least
also favors the abstract system the most.

Given an input variable partition $\hat{\imath} = \{\omega_E, \omega_S\}$ and the spurious
counterexamples in the SORs $(S^0, S^1, \ldots, S^j, \ldots, S^L)$, the environment's
winning positions inside $S^j$ are computed as follows:

$$\exists \omega_E .\, \forall \omega_S .\, \exists \hat{y} .\, \left[ S^j(\hat{x}) \wedge \widehat{T}(\hat{x}, \hat{\imath}, \hat{y}) \wedge S^{j+1}(\hat{y}) \right] ,$$

which is the subset of $S^j$ states from which the environment can force
the transition to $S^{j+1}$ despite the opposition of the system.

The normalized number of winning positions for the hostile
environment inside $S^j$ is computed as follows:

$$N_j^{\{\omega_E, \omega_S\}} = \frac{\left| \exists \omega_E .\, \forall \omega_S .\, \exists \hat{y} .\, \left[ S^j(\hat{x}) \wedge \widehat{T}(\hat{x}, \hat{\imath}, \hat{y}) \wedge S^{j+1}(\hat{y}) \right] \right|}{\left| S^j(\hat{x}) \right|} .$$

Given a set $S$, we use $|S|$ to denote the cardinality of the set. The
normalized number of winning positions, denoted by $N_j$, is a good indicator
of the impact of refining with respect to the variables in $\omega_S$. For the
purpose of refinement, we prefer the partition that gives $N_j$ the lowest
value.

We use universal quantification ($\forall$) to mimic the impact of parallel
composition on the abstract model, since both operations reduce the
number of enabled edges. It can be shown that when an edge label has
an essential variable, that is, a variable that factors out of its label (as
on all the edges in Figure 4.5 except the edge from State 5 to State 7), composing
that variable with the abstract model splits the abstract edge into two
edges (instead of four). Furthermore, among the two new tail states
created by the splits, one has no fan-out; that is, it is a deadend split.








Figure 4.5. Illustration of the winning positions. 


The abstract model in Figure 4.5 has $S^0 = \{1, 2, 5, 6\}$, $S^1 = \{3, 4\}$,
and $\hat{\imath} = \{g, f\}$. When the partition of $\hat{\imath}$ is such that $\omega_E = \{g\}$ and
$\omega_S = \{f\}$, the set of winning positions for the hostile environment is
{1, 2}. State 1 is a winning position because when the hostile
environment makes the assignment $g = 1$, the system player will be forced to a
$\neg p$ state (either 3 or 4) no matter what value it assigns to $f$. A similar
argument applies to State 2 as well. According to the definition of $N_j$,
with the superscript listing $\{\omega_E, \omega_S\}$,

$$
\begin{aligned}
N_0^{\{\{g, f\}, \{\}\}} &= 1.0, \\
N_0^{\{\{g\}, \{f\}\}} &= 0.5, \\
N_0^{\{\{f\}, \{g\}\}} &= 0.25, \\
N_0^{\{\{\}, \{g, f\}\}} &= 0.0.
\end{aligned}
$$

The result of this calculation indicates that $g$ is a better candidate than
$f$ for refinement, because putting $g$ alone in $\omega_S$ gives the hostile
environment one winning position, while putting $f$ alone in $\omega_S$ gives it two
winning positions.
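The calculation above can be reproduced on a small explicit-state model. The sketch below is an assumption-laden illustration: the edge labels in `EDGES` are hypothetical, chosen only so that they agree with what the text states about Figure 4.5 (the rings, the essential variables, and the four values of $N_0$); they are not taken from the figure itself. A symbolic implementation would use BDD quantification instead of enumerating input valuations.

```python
from itertools import product

# Hypothetical edge labels (source -> {target: label}); NOT those of Figure 4.5.
EDGES = {
    1: {3: lambda g, f: g and f, 4: lambda g, f: g and not f},
    2: {4: lambda g, f: g},
    5: {3: lambda g, f: g and f, 7: lambda g, f: not (g and f)},
    6: {3: lambda g, f: f},
}
RING, NEXT_RING = {1, 2, 5, 6}, {3, 4}


def can_step(state, values):
    """True if some enabled edge leads from `state` into the next ring."""
    return any(label(**values) for target, label in EDGES[state].items()
               if target in NEXT_RING)


def winning_positions(env_vars, sys_vars):
    """States where some environment choice works for every system choice."""
    winners = set()
    for state in RING:
        for env_choice in product([False, True], repeat=len(env_vars)):
            env_values = dict(zip(env_vars, env_choice))
            if all(can_step(state, {**env_values, **dict(zip(sys_vars, sys_choice))})
                   for sys_choice in product([False, True], repeat=len(sys_vars))):
                winners.add(state)
                break
    return winners


def normalized(env_vars, sys_vars):
    """Fraction of ring states that are winning for the environment."""
    return len(winning_positions(env_vars, sys_vars)) / len(RING)
```

For instance, `winning_positions(("g",), ("f",))` evaluates the partition $\omega_E = \{g\}$, $\omega_S = \{f\}$ discussed in the text.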

A further explanation of the heuristic via state splitting is shown by
the two examples in Figure 4.6 and Figure 4.7. In the first example, $g$ is
an essential variable to the label on the spurious transition $2 \to 4$, and
$f$ is an irrelevant variable. A variable $v$ is essential to a function $f(v)$ if
and only if either $f(0) = 0$ or $f(1) = 0$. By intuition, one would prefer
refining with $g$, because $f$ is irrelevant. This is the right choice because
it will split State 2 into two new states, $(\neg g, 2)$ and $(g, 2)$, only one of
which has an outgoing edge to State 4. In other words, it is possible
to remove this spurious edge, namely in the case when State $(g, 2)$ becomes
unreachable after refinement. Refining with $f$, however, does not have
such an impact since both of the two new states, $(\neg f, 2)$ and $(f, 2)$, will
have an outgoing edge to State 4. This is consistent with the game based
analysis: State 2 is a winning position for the hostile environment if it
controls $g$.
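The definition of an essential variable can be checked mechanically by testing whether either cofactor of the label is constantly false. The following minimal sketch assumes that labels are given as Python predicates with keyword arguments; the function name `is_essential` and the brute-force enumeration are illustrative assumptions, not part of the original method, which would operate on BDDs.

```python
from itertools import product


def is_essential(label, var, variables):
    """True if fixing `var` to 0 or to 1 makes `label` constantly false."""
    others = [name for name in variables if name != var]
    for fixed in (False, True):
        cofactor_is_zero = not any(
            label(**{var: fixed, **dict(zip(others, rest))})
            for rest in product([False, True], repeat=len(others)))
        if cofactor_is_zero:
            return True
    return False


# A label that depends only on g: g is essential, f is irrelevant.
g_essential = is_essential(lambda g, f: g, "g", ("g", "f"))
f_essential = is_essential(lambda g, f: g, "f", ("g", "f"))
```

Under this definition, a label such as the negation of `g and f` has no essential variable, because neither cofactor with respect to g or f is constantly false.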

In the second example (Figure 4.7), it is no longer straightforward to 
figure out that g is a better refinement candidate than f, because both g 
and f appear in the edge labels. However, the game based analysis tells 
us that State 1 is a winning position for the hostile environment if it 
controls g. Refining with g produces a similar deadend split—only one 
of the two new states has an outgoing edge to the next ring. Therefore, 
it is possible to remove this spurious edge—in the case that State (g,1) 
becomes unreachable after refinement. Refining with f, however, leaves 
the spurious edges intact, since both of the two new states have outgoing 
edges to the next ring. This is also consistent with the game based 
analysis—there is no winning position for the hostile environment if the 
abstract system controls g.


Figure 4.6. An example for state splitting: g is a better refinement candidate.
[Panels: Refine with variable g; Refine with variable f]


To summarize, our refinement algorithm selects a small subset of
invisible variables into $\omega_S$ such that the partition $\{\omega_E, \omega_S\}$ minimizes the
following number,

$$\sum_{0 \le j < L} N_j^{\{\omega_E, \omega_S\}} .$$

This optimization is greedily approximated inside REFINEABSTRACTION:
the one variable that minimizes the above number is repeatedly picked
(Line 19 in Figure 4.4).
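The greedy approximation can be sketched as follows. The function `refine_greedily` and its `cost` callback are hypothetical names: `cost` stands for the sum of normalized winning positions given the variables already moved into $\omega_S$, and the `SCORES` table simply reuses the four values of the Figure 4.5 example for a single ring. The guard against an empty $\omega_E$ and the alphabetical tie-break are assumptions added to keep the sketch well defined.

```python
def refine_greedily(invisible, cost, threshold):
    """Move up to `threshold` variables into omega_S, one best variable at a time."""
    omega_s, omega_e = set(), set(invisible)
    while len(omega_s) < threshold and omega_e:
        # Pick the variable whose addition to omega_S gives the lowest cost.
        best = min(sorted(omega_e), key=lambda v: cost(frozenset(omega_s | {v})))
        omega_s.add(best)
        omega_e.remove(best)
    return omega_s, omega_e


# Cost keyed by omega_S, taken from the worked example in the text.
SCORES = {frozenset(): 1.0, frozenset("f"): 0.5,
          frozenset("g"): 0.25, frozenset("gf"): 0.0}

omega_s, omega_e = refine_greedily({"g", "f"}, SCORES.__getitem__, threshold=1)
```

With a threshold of one, the sketch selects g, in agreement with the conclusion drawn from the example.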

The computation of winning positions is similar to BDD based
pre-image computation. A normal pre-image would have $\exists \omega_E .\, \exists \omega_S .\, \exists \hat{y}$
instead of $\exists \omega_E .\, \forall \omega_S .\, \exists \hat{y}$. Since $S^j(\hat{x})$ does not have any quantified
variable in its support, we can pull it out of the quantification operators.
Furthermore, the following common intermediate result

$$\exists \hat{y} .\, \left[ \widehat{T}(\hat{x}, \hat{\imath}, \hat{y}) \wedge S^{j+1}(\hat{y}) \right]$$

can be shared among various partitions of $\hat{\imath}$. Combining these
simplifications with the conventional early quantification techniques, we can
make the computation of $N_j$ very efficient. Another thing we would like
to point out is that $\omega_E \cup \omega_S$ contains only invisible variables that are
in the local support of the current abstract model, not necessarily the
entire set of invisible variables.


Figure 4.7. Another example for state splitting: g is still a better refinement candidate.
[Panels: Refine with variable g; Refine with variable f]
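The simplification of the winning-position computation described above can be written out as a short derivation. The name $P_j$ for the shared intermediate result is introduced here only for illustration; the derivation relies solely on the fact that $S^j(\hat{x})$ depends on none of the quantified variables.

```latex
\begin{aligned}
P_j(\hat{x}, \hat{\imath})
  &= \exists \hat{y} .\, \left[ \widehat{T}(\hat{x}, \hat{\imath}, \hat{y})
     \wedge S^{j+1}(\hat{y}) \right], \\
\exists \omega_E .\, \forall \omega_S .\, \exists \hat{y} .\,
  \left[ S^j(\hat{x}) \wedge \widehat{T}(\hat{x}, \hat{\imath}, \hat{y})
  \wedge S^{j+1}(\hat{y}) \right]
  &= \exists \omega_E .\, \forall \omega_S .\,
     \left[ S^j(\hat{x}) \wedge P_j(\hat{x}, \hat{\imath}) \right] \\
  &= S^j(\hat{x}) \wedge \exists \omega_E .\, \forall \omega_S .\,
     P_j(\hat{x}, \hat{\imath}).
\end{aligned}
```

Since $P_j$ does not depend on the partition, it needs to be computed only once per ring; only the final quantification over $\omega_E$ and $\omega_S$ has to be repeated for each candidate partition.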


The refinement method in [GKMH+03] also used more than one
counterexample. It was based on the classification of invisible variables into
strong 0/1 signals and conditional 0/1 signals. A signal was defined as a
strong 0/1 signal if, in all counterexamples, its value at the given
step of the trace is always 0 or always 1, respectively. In other words,
only when a variable is essential with respect to the labels of all the
abstract edges from $S^j$ to $S^{j+1}$ will it be classified as a strong 0/1
signal. Otherwise, it will be classified as a conditional 0/1 signal. In
practice, however, strong 0/1 signals as defined in [GKMH+03] are very
rare. In fact, neither $f$ nor $g$ in Figure 4.5 is a strong 0/1 signal;
according to [GKMH+03], both would have been classified as
conditional 0/1 signals, and therefore would have been assigned the same
weight in refinement variable selection. In contrast, GRAB can tell that
$g$ is actually a better refinement candidate than $f$. In general, GRAB is
often more accurate in identifying important refinement variables.


4.3 Keeping Refinement Set Small 


In fine-grain abstraction, there are two types of elementary transition 
relations: one is associated with state variables, while the other is
associated with Boolean network variables. The addition of these variables to
the abstract model indicates two different directions for the refinement.
In the sequential direction, we only add state variables (or latches), which
results in a potentially larger state space in the refined model. In the
Boolean direction, we only add more logic gates in the fan-in cones of 
the visible state variables, which means that the state space will stay 
the same but some transitions will be removed. Our experience shows 
that if one makes no distinction between these two types of variables, 
the final abstract model may contain many redundant state variables. 
This suggests that the variable selection procedure needs some guidance 
on the proper refinement direction. 


4.3.1 Choosing The Refinement Direction 


At every refinement step, the proper refinement direction needs to 
be predicted. If going in the Boolean direction (i.e., without adding 
any state variable) can remove the spurious counterexamples, then the 
Boolean direction should be chosen to avoid a potentially larger state 
space. A satisfiability check similar to the concretization test can be used
to predict the refinement direction. We can formulate the SAT problem
as a constrained BMC instance that is unsatisfiable if and only
if adding the complete fan-in cones of all visible state variables removes
all the spurious counterexamples. This SAT problem differs from the multi-thread
concretization test in that an extended abstract model, instead of the
concrete model, is unrolled exactly $L$ time frames.


DEFINITION 4.2 Given a fine-grain abstract model $\langle \widehat{T}, \widehat{I} \rangle$, the extended
abstract model $\langle \widehat{T}_e, \widehat{I} \rangle$ is defined as the one that contains all the visible
state variables of the fine-grain model and the complete fan-in logic cones 
of these state variables. 


Referring to Figure 3.1, when the current (fine-grain) abstract model
contains Latch 1 and Gates 5, 7, and 9, the extended abstract model contains
tains Latch 1 and Gates 5, 7, and 9, the extended abstract model contains 
Latch 1, and Gates 1, 2, 4, 5, 7, and 9. 

Let $\widehat{T}_e$ be the transition relation of the extended abstract model; then,
the refinement direction can be decided by solving the SAT problem of
$\Psi^e = \Psi_E \wedge \Psi_S$, defined as follows:

$$\Psi_E = \widehat{I}(X^0) \wedge \bigwedge_{0 \le i < L} \widehat{T}_e(X^i, W^i, X^{i+1}) ,$$

$$\Psi_S = \bigwedge_{0 \le j \le L} S^j(X^j) .$$

Formula $\Psi_E$ enables only paths of length $L$ that are allowed by the
extended abstract model, and $\Psi_S$ enables only paths embedded in the
abstract SORs. If $\Psi^e$ is unsatisfiable, it means that the abstract
counterexamples in the SORs do not exist in the extended abstract model. In
other words, it is possible to remove all the length-$L$ counterexamples
by just adding some logic gates that are in the fan-in cones of the visible
state variables; in this case, we opt for the Boolean direction. On the
other hand, if $\Psi^e$ is satisfiable, it means that even adding all the logic
gates in the Boolean direction cannot remove the spurious
counterexamples. In this latter case, more state variables need to be added by going
in the sequential direction.


4.3.2 Minimizing The Refinement Set 


Once the entire set of spurious counterexamples in the SORs is
removed, all the newly added variables form a sufficient refinement set;
that is, they are enough to remove all the length-$L$ spurious
counterexamples. However, this refinement set may not be minimal. In previous
work [WHL+01, CCK+02], a trial-and-error based greedy minimization
has been used to remove redundant variables from the refinement set. 
This kind of greedy minimization can also be applied here in GRAB. 
With fine-grain abstraction, however, the minimization must be applied 
in both refinement directions, and with respect to the entire SORs in- 
stead of a single counterexample. 

Recall that for the spurious counterexamples of a certain length, the 
GRAB refinement is performed first in the sequential direction. As soon 
as a sufficient set of state variables is added, it is minimized with respect 
to the entire bundle of counterexamples before refinement shifts to the 
Boolean direction. When a set of state variables is being minimized, the
extended abstract model induced by these state variables is unrolled
to form the SAT formula $\Psi^e$ (see the previous section). We go
through all the state variables in the refinement set, one at a time, to
check if any of them is redundant. This is done by temporarily removing
a state variable from the abstract model and checking whether $\Psi^e$ is still
unsatisfiable. Every time a state variable is removed from the refinement
set, all the Boolean network variables that are relevant only to this state
variable are also pruned away. If removing a state variable does not cause
any spurious counterexample to reappear, then the variable is redundant
and will be dropped permanently. However, if after removing a variable,
the new formula $\Psi^e$ becomes satisfiable, we need to add the variable
back and proceed to the next variable.
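The trial-and-error loop described above can be sketched in a few lines. In this hedged illustration, the callback `is_sufficient` stands in for the SAT check (it should return true exactly when the unrolled formula remains unsatisfiable without the removed variable); both the function name `greedy_minimize` and the toy sufficiency condition are assumptions, and the pruning of Boolean network variables that depend only on a removed state variable is not modeled.

```python
def greedy_minimize(refinement_set, is_sufficient):
    """Drop variables one at a time while the remaining set stays sufficient."""
    kept = list(refinement_set)
    for var in list(refinement_set):
        trial = [v for v in kept if v != var]
        if is_sufficient(trial):
            kept = trial   # var is redundant: drop it permanently
        # otherwise var is needed: keep it and proceed to the next variable
    return kept


# Toy stand-in for the SAT check: "a" is required, plus at least one of "b", "c".
def toy_sufficient(variables):
    return "a" in variables and ("b" in variables or "c" in variables)


minimal = greedy_minimize(["a", "b", "c", "d"], toy_sufficient)
```

The result is minimal in the sense that no single remaining variable can be dropped, but it depends on the order in which the variables are tried, which is typical of greedy minimization.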

Note that a sufficient set of state variables only means that the
abstract counterexamples do not appear in the extended abstract model
($\widehat{T}_e$), but they may still exist in the fine-grain abstract model ($\widehat{T}$).
After the greedy minimization in the sequential direction, we will stay in
the current refinement generation and switch to the Boolean direction.
After refinement shifts to the Boolean direction, only Boolean network
variables will be added until the same set of SORs is removed again.
At this point, the set of newly added Boolean network variables is also
a sufficient set and will be greedily minimized. The minimization
procedure for BNVs is similar to that for state variables, with the only
difference being that now we are using the fine-grain abstract model
(with $\widehat{T}$ instead of $\widehat{T}_e$).

We define the greedy refinement procedure more formally as follows: 


DEFINITION 4.3 Given a sufficient set of refinement variables and the 
SORs, the refinement minimization problem can be defined as finding 
the minimal subset of refinement variables that can remove the spurious 
counterexamples.


Our implementation of the greedy minimization uses a SAT solver and 
the satisfiability checks are similar to the multi-thread concretization 
test. BDDs in the SORs are translated into CNF formulae, as described 
in the previous chapter, to constrain the SAT problem. Note that in the
concretization test, the SAT formula comes from the unrolling of the
concrete model, and it captures all the length-L paths allowed by the 
concrete model. In refinement minimization, the SAT formula comes 
from the unrolling of an abstract model (extended abstract model for 
minimizing state variables, and fine-grain abstract model for minimizing 
Boolean network variables). Since the satisfiability checks are conducted 
in the abstract models, which can be arbitrarily smaller than the concrete 
model, these SAT problems are usually much easier to solve. 


4.4 Applying Sequential Don't Cares 


Previous work in abstraction refinement often divides the set of vari- 
ables (state variables and BNVs) of the concrete model into two parts: 
a set of visible variables and a set of invisible variables. Model checking
is applied to the abstract model that contains only the elementary 
transition relations of visible variables. The elementary transition rela- 
tions of invisible variables, on the other hand, are completely ignored. 
Since their transition constraints are removed, the invisible variables are 
treated as inputs during model checking—they can take arbitrary val- 
ues at all times. This explains why counterexamples may exist in an 
abstract model but not in the concrete model. The valuations of some 
of the inputs that are responsible for triggering these counterexamples may 
not be allowed in the concrete system. 

Figure 4.8. Sequential Don't Cares from remaining submodules. 

In this section, we show that with some additional analysis of the 
invisible part of the system, we can further constrain the valuations of 
these invisible variables. As illustrated in Figure 4.8, the invisible part 
of the system is further decomposed into a series of submodules, each 
of which contains a subset of the invisible latches. We then perform an 
over-approximate reachability analysis of the set of submodules. Ap- 
proximate reachable states of the invisible part can be computed by 
first analyzing each submodule in isolation, assuming that the other 
submodules are in any states that have already been estimated to be 
reachable, and then propagating the result to other submodules to im- 
prove the reachability analysis on them. If there are circular connections, 
the reachability analysis will be iterated Machine-By-Machine (MBM) 
[CHM+96a, MKSS99] until a fixpoint is reached. 
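
The iteration described above can be illustrated with a small explicit-state sketch. It is only a toy under stated assumptions: the actual analysis is symbolic (BDD based), and the names `approximate_reachability`, `steps`, and the three one-latch submodules are hypothetical. Each submodule is explored assuming the others may be in any state already estimated to be reachable, and the passes repeat until nothing changes.

```python
from itertools import product
from typing import Callable, Dict, Hashable, Set

State = Hashable
Env = Dict[str, State]
Step = Callable[[State, Env], Set[State]]


def approximate_reachability(init: Dict[str, State],
                             steps: Dict[str, Step]) -> Dict[str, Set[State]]:
    """Per-submodule over-approximation of the reachable states."""
    reach: Dict[str, Set[State]] = {name: {s0} for name, s0 in init.items()}
    changed = True
    while changed:  # iterate machine by machine until a fixpoint is reached
        changed = False
        for name, step in steps.items():
            others = [m for m in reach if m != name]
            found: Set[State] = set()
            # The environment ranges over every combination of states that
            # the other submodules are currently estimated to reach.
            for values in product(*(list(reach[m]) for m in others)):
                env = dict(zip(others, values))
                for s in list(reach[name]):
                    found |= step(s, env)
            if not found <= reach[name]:
                reach[name] |= found
                changed = True
    return reach


# Three one-latch submodules: x' = not y, y' = not x, w' = w.
steps = {
    "x": lambda s, env: {1 - env["y"]},
    "y": lambda s, env: {1 - env["x"]},
    "w": lambda s, env: {s},
}
approx = approximate_reachability({"x": 0, "y": 0, "w": 0}, steps)
print(approx)
```

In this toy, the exact reachable states are (x, y, w) = (0, 0, 0) and (1, 1, 0). The product of the per-submodule estimates also admits (0, 1, 0) and (1, 0, 0), so it is an upper bound, yet it still rules out every valuation with w = 1.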

The set of approximate reachable states of the invisible part is an 
upper bound on the set of exact reachable states. If certain valuations of 
the invisible variables are not even in the set of approximate reachable 
states, they will never appear in the original system. Therefore, this 
set can be used to constrain the behaviors of the invisible variables, or 
pseudo-primary inputs, of the abstract model. Conceptually, constraints 
can be added by conjoining the set of approximate reachable states with 
the transition relation of the abstract model. During symbolic model 
checking of the abstract model, certain valuations of the pseudo-primary 
inputs will be disabled. 

In the current implementation, we apply the machine decomposition 
algorithm as suggested by [CHM+96b] to the entire model, and then use 
the LMBM [MKSS99] approximate state space traversal algorithm to 
compute the approximate reachable states. We compute this set of ap- 
proximate reachable states only once before the abstraction refinement 
loop starts. Inside the abstraction refinement loop, we use approximate 
reachable states at every refinement iteration step to constrain the for- 
ward reachability analysis of the abstract models. Specifically, the BDD 
operation constrain [CM90] is used to remove spurious transitions from 
T, by using the approximate reachable states as the care set. This often 
results in a smaller BDD representation than directly conjoining the care 
set with T. 

Constraints on the behavior of the abstract model due to the neighboring
submachines prevent some spurious abstract counterexamples from 
occurring, so that the property can potentially be decided earlier in 
the refinement cycle. A more systematic integration of machine decomposition
and approximate reachability analysis into the abstraction 
refinement loop is possible. The result will be a multi-way partition 
refinement process. Partitioning of the model into submachines can be 
done so that the abstract model is one of the many submachines. In 
this new paradigm, refinement will be considered as merging the abstract
model with some other submachines. We leave this extension as 
interesting future work. 


4.5 Implementation and Experiments 


The GRAB algorithm and two competing refinement algorithms have 
been implemented in VIS-2.0 [B+96, VIS]. In our implementation, CUDD 
was used for BDD related computations and Chaff [MMZ+01] was used 
as the back-end SAT solver. Our experiments were conducted on 26 
hardware verification test cases coming from both industry and the VIS 
verification benchmarks [VVB]. The D-designs were kindly provided by 
the authors of [CCK+02]. All the experiments in this section were run 
under Linux on an IBM IntelliStation with a 1.7 GHz Intel Pentium 4 
CPU and 2 GB of RAM. 


4.5.1 Comparisons of Refinement Algorithms 


We first compare two variants of the GRAB algorithm against the default
invariant checking algorithm in VIS (CI), VIS's Bounded Model 
Checking (BMC), the SepSet algorithm [CGKS02], a variant of SepSet 
called SepSet+, and the conflict analysis algorithm (CA) of [CCK+02]. The 
results are presented in Table 4.1. The CI experiments consist of forward 
reachability analysis with early termination. For BMC, only the times 
for failing properties are reported. (BMC in VIS checks for simple in- 
ductive invariants, but none of these invariant properties can be proved 
by simple induction.) The variant of GRAB, denoted by GRAB, does 
not perform refinement minimization. The variant SepSet+ differs from 
SepSet in that it minimizes the number of variables in the separation 
set, instead of the size of the separation tree [CGKS02]. In this section, 
we focus on comparing the performance of the various refinement algo- 
rithms only. For the purpose of this controlled experiment, the same 
coarse-grain abstraction and concretization test procedures were used 
for all abstraction and refinement methods. 

Each model checking run was limited to 8 hours. Dynamic variable 
reordering was enabled (with method sift) for all BDD operations. In 
Table 4.1, the first column lists the names of the test cases. The sec- 
ond column lists the number of binary state variables in the cone of 
influence (COI) of the property. The third column shows the length of 
the counterexample, or of the last ACE encountered by GRAB if the 
property holds (indicated by a T). CPU times are in seconds and are 
all-inclusive. For each of the abstraction refinement methods compared, 
ite is the number of refinement iterations; reg is the number of state 
variables in the proof or refutation. If an experiment ran out of time, 
the number of iterations performed up to that point and the number of 
state variables in the last abstract model are given in parentheses. For 
GRAB we also report sat, the time spent in the SAT solver during ACE 
concretization. Note that in GRAB ite can be larger than reg because 
of refinement minimization. 


Both variants of the GRAB algorithm significantly outperform CI, 
SepSet, and CA in terms of CPU time. BMC has the best times for 
several failing properties, but is slow for the hardest problems and fails 
for the passing properties. Note that the last two properties cannot 
be proved or refuted by any method. Regarding the size of the BDDs 
used throughout verification, GRAB is much more efficient than CI; SepSet 
and CA have even fewer BDD nodes, because they use the SAT solver 
(instead of BDDs) to compute the refinement; unlike GRAB, they do not 
need backward reachability analysis. BMC uses no BDDs. 

Table 4.2 compares the final abstractions of GRAB and CA. In the 
table, g denotes the cardinality of the final set of state variables pro- 
duced by GRAB, while c denotes the cardinality of the final set of state 
variables produced by CA. The first three columns are repeated from 
Table 4.1. Table 4.2 shows that in general there is very good correlation 
between the final abstractions produced by CA and GRAB. In the 22 
experiments that both methods completed, GRAB and CA produced the 
same final abstraction in three cases. In another 10 cases, the abstraction
produced by GRAB is strictly better than the one of CA. Conversely, 
in two cases, CA produces an abstraction that is strictly better than the 
one of GRAB. These differences are in part a consequence of applying re- 
finement minimization once every outer iteration step in GRAB, instead 
of once for every single counterexample in CA. The other sources of dif- 
ference are the order in which variables are selected for refinement (this 
is what happens in D24-p2) and the order in which they are considered 
by the greedy minimization procedure. 

Although we exercised diligence in implementing the algorithms of 
[CGKS02, CCK+02], there remain differences between the originals and 
our reimplementations. For instance, we used the coarse-grain approach when 
comparing various refinement methods. This is not the case for the original
methods of [CCK+02], and will in some cases impede the search 
for a good abstraction. However, in this set of controlled experiments, 
the drawback is shared by all methods we implemented, and therefore 
should not have a major impact on the comparison we present. 

Further evidence for the importance of global guidance, i.e., SOR 
guided vs. single counterexample guided, is provided by an analysis of 
abstraction efficiency for 80 medium size invariant checking problems 
from the VIS Verification Benchmarks [VVB]. Each test case in this experiment
has a passing property and a non-trivial abstract model (it 
requires at least one refinement iteration step). The abstraction efficiency
is 0 if the final model contains all state variables, and 100% if it
contains none. 
Figure 4.9 shows scatter plots of the abstraction efficiency of SepSet, 
CA, and GRAB. Note that each point below the diagonal represents a 
win for GRAB. SepSet+ behaves like SepSet. Scatter plots for the other 
pairs of methods (not shown) show no clear winner. 
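
The text fixes only the two endpoints of the abstraction efficiency measure. One natural reading, assumed here purely for illustration, is a linear scale between them: the fraction of state variables that the final abstract model leaves out. The function name `abstraction_efficiency` is hypothetical.

```python
def abstraction_efficiency(final_vars: int, total_vars: int) -> float:
    """Fraction of state variables left out of the final abstract model.

    Assumption: a linear scale between the two stated endpoints
    (0 when all variables are kept, 1.0, i.e. 100%, when none are kept).
    """
    if total_vars <= 0 or not 0 <= final_vars <= total_vars:
        raise ValueError("need 0 <= final_vars <= total_vars and total_vars > 0")
    return 1.0 - final_vars / total_vars


# Hypothetical model with 100 state variables, 25 of them in the final abstraction.
print(abstraction_efficiency(25, 100))
```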

Refinement minimization, though essential for good performance of 
CA, does not always improve CPU time when applied to the proposed re- 
finement scheme. The time spent checking the variables for redundancy 
and the additional iterations are not always offset by the reduction in 
the size of the abstraction. Nonetheless, as one proceeds toward larger 
models, refinement minimization adds to the robustness of the method. 


4.5.2 Experiments with Fine-Grain Abstraction 


Experiments were also conducted to test the effectiveness of fine-grain 
abstraction and the use of sequential don't cares. In the experiments, we 
set the BDD size threshold to 1000 for frontier partitioning. Therefore, 
every time the BDD size of the transition function went beyond this 
threshold, a Boolean network variable was inserted in the combinational 
logic cones. 

Figure 4.9. Comparing the abstraction efficiency of different refinement algorithms: 
(1) GRAB vs. SepSet; (2) GRAB vs. CA. 

The first four columns of Table 4.3 repeat the statistics of the test 
cases: the first column shows the names of the designs; the second 
and third columns give the numbers of binary state variables and logic 
gates in the cone of influence, respectively. The fourth column indicates 
whether the properties are true (T) or false (F). If the properties are 
false, the lengths of the shortest counterexamples are given. The follow- 
ing six columns compare the performance of three different implementa- 
tions: GRAB using the coarse-grain abstraction, +FINEGRAIN using the 
fine-grain abstraction method, and +ARDC using fine-grain abstrac- 
tion plus the use of sequential don’t cares. The underlying algorithm 
for picking refinement variables is the same game-based strategy for the 
three methods. For each method, the CPU time in seconds and the 
number of state variables in the final abstract model are compared. 


The fine-grain abstraction approach shows a significant performance 
improvement over GRAB. First, it is able to finish the two largest test 
cases that cannot be verified by GRAB. Careful analysis of IU-p1 and 
IU-p2, two problems from the instruction unit of the PicoJava microprocessor,
shows that some of their registers have extremely large fan-in
combinational logic cones. Without fine-grain abstraction, abstract 
models with only 10 registers would have been too complex for the model 
checker to deal with. For the other test cases that both methods man- 
aged to finish, +FINEGRAIN is significantly faster than GRAB. In fact, 
the total CPU time required to finish the 24 remaining test cases is 
12,207 seconds for GRAB, and 7,562 seconds for +FINEGRAIN. 


With the use of sequential don't cares, the performance of +FINE- 
GRAIN is further improved. +ARDC is significantly faster than both 
+FINEGRAIN and GRAB on more than half of the 26 test cases, and is 
also comparable for the remaining ones. The total CPU time required to 
finish all the 26 test cases is 10,724 seconds for +FINEGRAIN and 8,130 
seconds for +ARDC; this is an average speed-up of about 25%. 
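
As a quick check of the totals quoted above, the relative reduction in CPU time can be computed directly. The helper name `relative_reduction` is only illustrative; the four totals are the ones stated in the text.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction of CPU time saved when going from `before` to `after` seconds."""
    return (before - after) / before


# Totals quoted in the text, in seconds.
grab_to_finegrain = relative_reduction(12207, 7562)   # 24 test cases
finegrain_to_ardc = relative_reduction(10724, 8130)   # all 26 test cases

print(f"GRAB -> +FINEGRAIN: {grab_to_finegrain:.1%} less CPU time")
print(f"+FINEGRAIN -> +ARDC: {finegrain_to_ardc:.1%} less CPU time")
```

Measured this way, sequential don't cares save roughly a quarter of the total CPU time, in line with the figure quoted above, while fine-grain abstraction alone saves somewhat more than a third on the 24 test cases that GRAB also finishes.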


Figure 4.10 shows the allocation of CPU time among the different 
phases of abstraction refinement. The data were extracted with +ARDC; 
therefore, they correspond to the last column in Table 4.3. The four fig- 
ures at the left-hand side give, in percentage, the CPU time spent on 
reachability analysis, on computing the SORs, on the multi-thread con- 
cretization test, and on computing the refinement with GRAB, respec- 
tively. The four figures at the right-hand side give the corresponding 
CPU time in seconds. In each figure, the 26 test cases are listed on 
the x-axis in the same order as they appear in Table 4.3 (with 1 representing
D24-p1, and 26 representing IU-p2). Note that other tasks 
sometimes consume a non-negligible part of the CPU time, such as 
incrementally building the BDD partitions and the creation and deletion of 
abstract FSMs. 

Figure 4.10 demonstrates that the forward reachability computation 
and computing refinement with GRAB often consume most of the CPU 
time. The backward reachability computation for building the SORs, 
on the other hand, often takes significantly less time than its forward 
counterpart, even though it collects all the shortest counterexamples. 
This is due to the application of forward onion rings as care sets in 
the corresponding pre-image computations. Furthermore, the actual run 
time of the concretization test is often small, as shown by the “in-seconds” 
figure, even though it accounts for a significant percentage of 
the total CPU time (as shown by the “in-percentage” figure). On this 
particular set of test cases, the multi-thread concretization test is never the 
performance bottleneck; on the harder problems, test cases 19-26, the 
concretization test overhead becomes negligible. 

The performance of the forward reachability computation is limited 
by the capacity of the state-of-the-art BDD based symbolic techniques. 
(However, there may exist other examples on which BMC is extremely 
time-consuming; for them, the SAT multi-thread concretization test will 
take a significant amount of CPU time.) As the abstract model gets 
larger, BDD based computations become more and more expensive. The 
size of the abstract model also affects the overhead of the GRAB refinement
algorithm: the size of the BDDs for representing the SORs often 
becomes larger as the model gets more complex. In addition, a larger 
abstract model may have more invisible variables in its local support, 
which means that more CPU time needs to be spent on scoring them. In 
general, this trend is true for almost all abstraction refinement methods. 


4.6 Further Discussion 


The GRAB refinement algorithm differs from [CGJ+00, CGKS02] and 
other single counterexample guided algorithms [CCK+02, BGG02] in 
that: (1) it handles all shortest abstract counterexamples rather than a 
single counterexample; (2) at each abstract counterexample level, a set 
of abstract states, instead of just one abstract state, is used to constrain 
the unrolled concrete model at each time step in the concretization test; (3) 
the refinement is based on the systematic analysis of all the spurious 
counterexamples in the SORs with a game-based heuristic algorithm. 

Since the GRAB refinement variable selection method operates solely 
on the abstract model and its local support variables, it is more scalable 
than those methods that involve symbolic computations in the concrete 
model. To our knowledge, the authors of [CGKS02, CGKS02] also had 
some preliminary experiments with multiple counterexamples and translation
of multiple counterexamples to the SAT problem for invalidation, 
although the work has not been published. 

The refinement algorithm in [GKMH+03] also relies on analyzing multiple
counterexamples. In their approach, multiple abstract error traces 
are represented by a data structure called the multi-valued counterexample.
However, their multi-valued counterexample is not guaranteed to 
capture all the shortest ones, making it incapable of catching concretizable
counterexamples at the earliest possible refinement step. Furthermore,
their variable selection algorithm is based on the classification of 
invisible variables into strong 0/1 signals and conditional 0/1 signals. 
We have shown that strong 0/1 signals are rare cases in practice, and as 
a result, their refinement algorithm is often less accurate than GRAB. 

In [MH04], Mang and Ho proposed a refinement algorithm based 
on controllability and cooperativeness analysis. Their cooperativeness 
analysis extracts a small subset of candidate input signals by using a 
3-valued simulation engine [WHL+01] to simulate the abstract counterexamples
and then ranking all the inputs (i.e., invisible state variables 
and BNVs) according to various criteria. Their controllability analysis
is independent of any particular counterexample; it is applied to a 
subset of input signals by scoring them according to a game-theoretic 
formula derived from the SORs. These two analyses are then 
carefully integrated to better refine the abstract model. Their 
controllability analysis is an improvement on the GRAB algorithm. Their 
experimental results showed a significant improvement over both GRAB 
and the REN method in [WHL+01]. 

The proof-based abstraction refinement methods in [MA03, LWS03, 
GGYA03, LS04, LWS05, ZPHS05] also implicitly handle all the counterexamples
of a finite length. These methods differ from ours in that 
their refinement variable selection algorithms are all SAT based, i.e., 
they rely on the SAT solver's capability to produce succinct unsatisfiability
proofs. In contrast, our core refinement variable selection algorithm
is purely BDD based, even though we use SAT as well in predicting
the refinement direction and in the concretization test. We note 
that a small unsatisfiability proof, i.e., the one with a small subset of 
Boolean variables or clauses, does not automatically give a small refine- 
ment set [LS04, GGA05b]. 

Both proof-based and counterexample based methods have their own 
advantages and disadvantages. A detailed experimental comparison of 
GRAB with a proof-based refinement algorithm can be found in our 
recent paper [LWS05], showing that these two kinds of methods complement
each other on the various test cases. Amla et al. [ADK+05] also 
published results of their experimental evaluation of the various SAT 
and interpolation based abstraction methods. There is also a trend of 
combining counterexample based methods and proof-based methods in 
abstraction refinement [AM04]. 

Table 4.2. Correlation between the final proofs of GRAB and CA. 

circuit  COI regs  cex len   g     c      union  common  GRAB only  CA only  contained
...
D16-p1   531       8         14    16     16     14      0          2        strict
...
D5-p1    319       31        18    13     18     13      5          0        strict
...
D21-p1   92        26        66    79     81     64      2          15       no
B-p1     124       T(18)     18    19     19     18      0          1        strict
B-p2     124       17        7     7      7      7       0          0        yes
M0-p1    221       T(3)      16    19     21     14      2          5        no
B-p3     124       T(4)      43    42     43     42      1          0        strict
D21-p2   92        28        70    83     85     68      2          15       no
B-p4     124       T(5)      42    43     43     42      0          1        strict
B-p0     124       T(17)     24    49     49     24      0          25       strict
rcu-p1   2453      T(3)      10    (9)    ?      ?       ?          ?        strict
D4-p2    230       T(19)     38    (171)  ?      ?       ?          ?        ?
IU-p1    4494      ?         ?     ?      ?      ?       ?          ?        ?
IU-p2    4494      ?         ?     ?      ?      ?       ?          ?        ?

Table 4.3. Comparing GRAB, +FINEGRAIN, and +ARDC. 

         COI    COI    cex  |    GRAB     | +FINEGRAIN  |    +ARDC
circuit  regs   gates  len  |  time  regs |  time  regs |  time  regs
D24-p1    147     8k     9  |     1     4 |     1     4 |     1     4
D24-p2    147     8k     T  |     3     8 |     3     8 |     3     8
D24-p3    147     2k     T  |    20     8 |     4     6 |     2     5
D24-p4    147     8k     T  |    43     8 |     4     6 |     2     5
D24-p5    147     8k     T  |             |     4     6 |     2     5
D12-p1     48     2k    16     24  23  19  24
D23-p1     85     3k     5  |    20    21 |     3    21 |    14    21
D5-p1     319    25k    31  |    31    18 |    42    13 |    32    13
D1-p1     101     5k     9  |     9    21 |    12    26 |    14    20
D1-p2     101     5k    13  |    51    23 |    27    23 |    29    23
D1-p3     101     5k    15  |    56    25 |    32    23 |    33    23
D16-p1    531    34k     8  |    92    14 |    25    14 |    21    14
D2-p1      94     1k    14  |   180    48 |   108    49 |    59    48
MO-p1     221    29k     T  |   136    16 |   204    13 |   942    13
rcu-p1    243    38k     T  |   195    10 |   188    10 |   216    10
B-p0      124     2k     T  |  1256    24 |  1507    24 |  1484    24
B-p1      124     2k     T  |   173    18 |   189    19 |   159    18
B-p2      124     2k    17  |    93     7 |    95     7 |    90     7
B-p3      124     2k     T  |   223    43 |    76    43 |    62    43
B-p4      124     2k     T  |   393    42 |   101    43 |   108    42
D22-p1    140     7k    10  |   720   132 |   242   132 |   191   132
D4-p2     230     8k     T  |  1103    38 |   204    38 |   195    38
D21-p1     92     1k    26  |  2817    66 |  2725    70 |   622    67
D21-p2     92    14k    28  |  4635    70 |  1748    75 |   868    67
IU-p1    4494   154k     T  |             |             |  2263    12
IU-p2    4494   154k     T  |             |   930    14 |   699    12

Figure 4.10. The CPU time distribution among the different phases of abstraction 
refinement: forward reachability analysis, computing SORs, multi-thread concretization
test, and computing the refinement set with GRAB. 


Chapter 5 


COMPOSITIONAL SCC ANALYSIS 


In abstraction refinement, the given property needs to be checked 
repeatedly in the abstract model, while the model is gradually refined. 
Information learned from previous abstraction levels can be carried on to 
help the verification at the current level. The major problem, however, 
is to identify the information that needs to be carried on and to apply 
it to improve the computational efficiency. 


In this chapter, we propose a compositional SCC analysis framework 
for LTL and fair CTL model checking. In this framework, the SCC 
partition of the state space from the previous abstract model is carried 
on to the next level. When we check the current abstract model, previous 
SCC partitions can be used as the starting point for computing the new 
SCC partition. 


We also exploit the reduction in automaton strength during abstrac- 
tion refinement to speed up the verification procedure. The strength of 
a Büchi automaton affects the complexity of checking the emptiness of 
its language. For weak or terminal automata [KV98, BRS99], specialized
fair cycle detection algorithms often outperform the general one. We 
have found that composing two automata may reduce, but can 
never increase, the automaton strength. Therefore, even if the abstract 
model or property automaton is initially strong, its strength can be re- 
duced through the successive refinements. In this chapter, we propose 
a new algorithm that dynamically selects model checking procedures to 
suit the current strength of the individual SCCs, and therefore takes 
maximal advantage of their weakness. 


5.1 Language Emptiness 


Checking language emptiness of a Büchi automaton is a core proce- 
dure in LTL [LP85, VW86] and fair CTL model checking [McM94], and 
in approaches to verification based on language containment [Kur94]. 
The cycle detection algorithms commonly used in symbolic model check- 
ers fall into two categories: one is based on the computation of an SCC 
hull [EL86, HTKB92, TBK95, KPR98, SRB02], and the other is based 
on SCC enumeration [XB00, BGS00, GPP03]. Although some SCC enumeration
algorithms [BGS00, GPP03] have better worst-case complexity 
bounds than the SCC hull algorithms, namely O(n log n) or O(n) versus O(n^2), 
where n is the number of states of the system, the comparative study of 
[RBS00] shows that the worst-case theoretical advantage seldom translates
into shorter CPU times. In many practical cases, applying any of 
these symbolic algorithms directly to the entire system to check language 
emptiness remains prohibitively expensive. 


In abstraction refinement, language emptiness is checked repeatedly 
in the abstract model while the model is gradually refined. It is natural 
to ask whether information learned from previous abstract models can 
be carried on to the current level to improve the computation efficiency. 
The major problem is to identify the kind of information that can be 
carried on, and to find ways of using it to speed up the verification. 


Although model checking applied to the abstract models may yield a 
conservative result, the SCC partition of the state space computed in the 
abstract model still provides valuable information for language emptiness 
checking of the concrete system (or a more refined abstract model). 
Given a model A and an over-approximation A', every SCC in A' consists 
of one or more complete SCCs of A. In other words, an SCC in the 
concrete system must be either included in or excluded completely from 
an SCC in the abstract model. Let Π be the set of SCCs in A; then 
Π is a refinement of the set of SCCs in A'. Similarly, an SCC C in 
A is a refinement of another SCC C' in A' if C ⊆ C'. If an SCC in 
the abstract model does not contain a fair cycle, none of its refinements 
will. Therefore, it is possible to inspect the fair SCCs in A' first, and 
then refine them individually to compute the fair SCCs in A. 


We will present a compositional SCC analysis framework for language 
emptiness checking called DNC (for Divide and Compose), which is 
based on the enumeration and the successive refinement of SCCs in a set 
of over-approximated abstract models. By combining appropriate cycle- 
detection algorithms (SCC hull or SCC enumeration) into the general 
framework, we create a hybrid algorithm that shares the good theoretical
characteristics of SCC enumeration algorithms, while outperforming 
the most popular SCC-hull algorithms, including the one by Emerson 
and Lei [EL86]. 

The procedure starts by performing SCC enumeration on the most 
primitive abstract model, which produces the initial SCC partition of 
the entire set of states. The partition is then made more accurate on a 
refined abstract model—one that is usually the composition of the cur- 
rent abstract model and a previously omitted submodule. SCCs that do 
not contain fair cycles are not considered in any more refined model. If 
no fair SCC exists in an abstract model, the language of the concrete 
model is proved empty. If fair SCCs exist, the abstract counterexamples 
can be checked against the concrete model in a way similar to the SAT 
based concretization test; the existence of a real counterexample means 
the language is not empty. When the concrete model is reached, the pro- 
cedure terminates with the set of fair SCCs of the concrete model. Since 
each concrete SCC is contained in an SCC of the abstract model, SCC 
analysis at previous abstraction levels can drastically limit the potential 
space that contains a fair cycle. 

In language emptiness checking, we regard the model as a generalized 
Büchi automaton. The strength of a Büchi automaton is an important 
factor for the complexity of checking the emptiness of its language. As 
shown in previous work [KV98, BRS99], when an over-approximation 
of A is known to be terminal or weak, specialized algorithms exist for 
checking the emptiness of the language in A. These specialized algo- 
rithms usually outperform the general language emptiness algorithm. 
However, the previous classification of strong, weak, and terminal was 
applied to the entire Büchi automaton. This can be inefficient, because 
a Büchi automaton with a strong SCC and several weak ones would be 
classified as strong. In this chapter, the definition of strength is extended 
to each individual SCC so that the appropriate model checking proce- 
dure can be deployed at a finer granularity. In addition, it is shown that 
the strength of an SCC never increases during the composition, but may 
actually decrease as submodules are composed. After the composition, 
a strong SCC can break into several weak SCCs, but a weak one cannot 
generate strong SCCs. Our DNC algorithm analyzes SCCs as they are 
computed to take maximal advantage of their weakness. 

The DNC algorithm achieves a favorable worst-case complexity bound, 
i.e., O(n) or O(n log n), depending on what underlying SCC enumeration 
algorithm is used. This complexity bound is valid even when DNC adds 
one submodule at a time until the concrete system is reached. In practice,
however, the effort spent on the abstract systems can be justified 
only if it does not incur too much overhead. As the abstract system 
becomes more and more concrete through composition, SCC enumeration
in the abstract model may become expensive. In such cases, the 
algorithm jumps directly to the concrete system, with all the useful infor- 
mation gathered from the abstract systems. Based on the SCC quotient 
graph of the abstract model, it disjunctively decomposes the concrete 
state space into subspaces. Each subspace induces a Büchi automaton 
that is an under-approximation of the concrete model; therefore, it ac- 
cepts a subset of the original language. The decomposition is also exact 
in the sense that the union of these language subsets is the original lan- 
guage. Therefore, language emptiness can be checked in each of these 
subautomata in isolation. By focusing on one subspace at a time, we 
can mitigate the BDD explosion during the most expensive part of the 
computation—detecting fair cycles in the concrete system. 


5.2 SCC Partition Refinement 


We start with the definition of over-approximation of a generalized 
Büchi automaton, followed by theorems that provide the foundation for 
the SCC analysis algorithm. Automaton A’ is an over-approximation 
of A if S = S', S0 ⊆ S0', T ⊆ T', F ⊇ F', and Λ = Λ'. An over-approximation
always simulates the original automaton, which is denoted
by A ⪯ A'. Given a set of automata defined in the same state 
space and with an alphabet originating from the same set of atomic 
propositions, the simulation relation ⪯ induces a partial order. 


THEOREM 5.1 (COMPOSITIONAL REFINEMENT) Let A,Aj,...,An be a 
set of labeled generalized Biichi automata such that A < A; for 1 < i € n. 
Then, the set of SCCs II(.A) is a refinement of 


O = (Cin---nCO, | C; e C A)) NO . 


PROOF: Every state in an SCC $C \in \Pi(\mathcal{A})$ is reachable from all other 
states in $C$. An over-approximation $\mathcal{A}_i$ preserves all the transitions of 
$\mathcal{A}$, which means that in $\mathcal{A}_i$, every state in $C$ remains reachable from the 
other states in $C$. Therefore, for $1 \le i \le n$, $C$ is contained in an SCC 
of $\mathcal{A}_i$; hence it is contained in their intersection, which is an element of 
$O$. Since the union of all SCCs of $\mathcal{A}$ equals $S$ and distinct elements of 
$O$ are disjoint, $O$ is a partition of $S$, and $\Pi(\mathcal{A})$ is a refinement of it. 
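
The refinement property can be checked on a toy example. The following is a minimal explicit-state sketch, not the symbolic implementation used in the book: the automata `t1` and `t2` are hypothetical over-approximations given as edge sets over a shared state space, and their composition is modeled simply as the intersection of the two edge sets. The helper names (`sccs`, `meet`, `refines`) are assumptions made for illustration.

```python
from itertools import product

def sccs(states, edges):
    """SCC partition as a set of frozensets (two states share a block
    iff each can reach the other; a state always reaches itself)."""
    succ = {s: set() for s in states}
    for u, v in edges:
        succ[u].add(v)

    def reach(s):
        seen, stack = {s}, [s]
        while stack:
            for t in succ[stack.pop()]:
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    fwd = {s: reach(s) for s in states}
    return {frozenset(t for t in fwd[s] if s in fwd[t]) for s in states}

def meet(*partitions):
    """Nonempty intersections C_1 & ... & C_n, one block from each partition."""
    blocks = set()
    for combo in product(*partitions):
        block = frozenset.intersection(*combo)
        if block:
            blocks.add(block)
    return blocks

def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(c <= d for d in coarse) for c in fine)

states = {0, 1, 2, 3}
t1 = {(0, 1), (1, 0), (1, 2), (2, 3), (3, 2)}          # hypothetical A_1
t2 = {(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 3)}  # hypothetical A_2
t = t1 & t2   # the composition keeps only transitions allowed by both

theta = meet(sccs(states, t1), sccs(states, t2))
print(refines(sccs(states, t), theta))
```

In this particular example the SCC partition of the composition happens to coincide with the intersection partition; in general it may be strictly finer.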


In particular, $\Pi(\mathcal{A})$ is a refinement of the SCC partition of any 
over-approximation of $\mathcal{A}$; thus, an SCC of $\mathcal{A}'$ is an SCC-closed set of $\mathcal{A}$. 
This theorem allows one to gradually refine the set of SCCs in the 
abstract models until $\Pi(\mathcal{A})$ is computed. It can often be decided early 
that an SCC-closed set does not contain an accepting cycle. For 
language emptiness checking, these non-fair SCC-closed sets are ignored. 
By working only on "suspect" SCCs, one can trim the state space with 
cheap computations in the simplified models. 


OBSERVATION 5.2 Let $C$ be an SCC-closed set of $\mathcal{A}$. If $C \cap F_i = \emptyset$ for 
some $F_i \in \mathcal{F}$, then $C$ has no states in common with any accepting cycle. 


Recall that we have defined the concrete model $\mathcal{A}$ as the synchronous 
(or parallel) composition of a set of submodules. Composing a subset 
of these submodules gives us an over-approximated abstract model $\mathcal{A}'$. 
For instance, if $\mathcal{A} = \mathcal{A}_1 \parallel \mathcal{A}_2$, then both $\mathcal{A}_1$ and $\mathcal{A}_2$ can be considered 
over-approximations of $\mathcal{A}$. It follows that an SCC in either $\mathcal{A}_1$ or $\mathcal{A}_2$ 
is an SCC-closed set in $\mathcal{A}$. 


DEFINITION 5.3 The strength of a fair SCC $C$ is defined as follows: 

■ $C$ is weak if all cycles contained within it are accepting. 


■ $C$ is terminal if it is weak and, for every state $s \in C$ and every 
proposition $a \in A$, the successor $s'$ of $s$ is in a terminal SCC. Terminality 
implies acceptance of all runs reaching $C$. 


■ $C$ is strong if it is not weak. 


Strength is defined only for fair SCCs. The strength of an SCC-closed 
set containing at least one accepting SCC is the maximum strength of its 
fair SCCs. The strength of an automaton is also the maximum strength 
of its fair SCCs. Our definition of weakness is more relaxed than that of 
[KV98, BRS99]. Previously, all the states of a weak SCC had to belong 
to all fair sets $F_i \in \mathcal{F}$, and a terminal SCC had to be maximal (i.e., have no 
successor SCCs). The relaxed definition still allows 
the use of faster symbolic model checking algorithms in the same way. 


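LEMMA 5.4 Given a labeled generalized Büchi automaton $\mathcal{A}$, if $C$ is a 
weak (terminal) SCC of an over-approximation $\mathcal{A}'$ of $\mathcal{A}$, then it contains 
no reachable fair cycle in $\mathcal{A}$ if and only if $\mathsf{EF}\,\mathsf{EG}\,C \cap S_0 = \emptyset$ 
($\mathsf{EF}\,C \cap S_0 = \emptyset$) holds in $\mathcal{A}$. 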


$\mathsf{EF}\,C$ is the subset of states in $S$ that can reach the states in $C$, while 
$\mathsf{EG}\,C$ is the subset of states in $C$ that lead to a cycle lying in $C$. Assume 
that $C$ is a terminal SCC in $\mathcal{A}'$, and a state $s \in C$ is reachable from the 
initial states in $\mathcal{A}$. Since for every proposition $a \in A$, the successor $s'$ of 
$s$ is in some terminal SCC, all runs reaching $s$ in $\mathcal{A}$ remain inside terminal 
SCCs afterward. Due to the finiteness of the automaton, these runs form 
cycles. Since terminal SCCs are also weak, all these runs are accepting. 
Therefore, the language is not empty if and only if $\mathsf{EF}\,C \cap S_0 \neq \emptyset$. If $C$ 
is a weak SCC in $\mathcal{A}'$, and a state $s \in C$ is reachable from $S_0$ in $\mathcal{A}$ and at 
the same time $s \in \mathsf{EG}\,C$, there exists a run through $s$ that forms a cycle 
in $C$. Since all cycles in the weak SCC $C$ are accepting, the language is 
not empty if and only if $\mathsf{EF}\,\mathsf{EG}\,C \cap S_0 \neq \emptyset$. Note that for a strong SCC, 
one must resort to the computation of $\mathsf{EG}_{\mathit{fair}}\,\mathit{true}$. 
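
The two cheap checks for weak and terminal sets can be illustrated with explicit fixpoint computations. This is a minimal sketch on explicit edge sets rather than BDDs; the function names `pre`, `EF`, `EG` and `has_fair_cycle` are hypothetical helpers, and the strength of `C` is assumed to have been determined beforehand.

```python
def pre(edges, target):
    """Pre-image: states with at least one successor in `target`."""
    return {u for u, v in edges if v in target}

def EF(edges, target):
    """Least fixpoint: states that can reach `target`."""
    z = set(target)
    while True:
        new = z | pre(edges, z)
        if new == z:
            return z
        z = new

def EG(edges, region):
    """Greatest fixpoint: states of `region` that start an infinite
    path staying inside `region`."""
    z = set(region)
    while True:
        new = z & pre(edges, z)
        if new == z:
            return z
        z = new

def has_fair_cycle(edges, init, C, strength):
    """Reachable-fair-cycle test for a weak or terminal set C."""
    if strength == "terminal":
        return bool(EF(edges, C) & init)
    if strength == "weak":
        return bool(EF(edges, EG(edges, C)) & init)
    raise ValueError("strong sets need a full fair-cycle computation")

edges = {(0, 1), (1, 2), (2, 1), (3, 3)}
print(has_fair_cycle(edges, {0}, {1, 2}, "weak"))      # cycle 1 -> 2 -> 1 is reachable from 0
print(has_fair_cycle(edges, {0}, {3}, "terminal"))     # state 3 is not reachable from 0
```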


THEOREM 5.5 (STRENGTH REDUCTION) Let $\mathcal{A}$ and $\mathcal{A}'$ be Büchi 
automata such that $\mathcal{A}$ and $\mathcal{A}'$ are complete and $\mathcal{A} \preceq \mathcal{A}'$. If $C$ is a weak 
(terminal) SCC-closed set of $\mathcal{A}'$, then $C$ is a weak (terminal) SCC-closed 
set of $\mathcal{A}$. 


PROOF: We prove this by contradiction. Assume that C is a weak set 
of A’, but is a strong set of A. Then, at least one cycle in C is not 
accepting in A. As an over-approximation, A’ preserves all paths of A, 
including this non-accepting cycle, which makes C a strong set of A’ too. 
However, this contradicts the assumption that C is weak in A’. There- 
fore, C cannot be a strong set of A. A similar argument applies to the 
terminal case. 


In other words, the strength of an SCC-closed set never increases as 
a result of composition. In fact, the strength may actually decrease in 
going from A’ to A. For example, a strong SCC may be refined into 
one or more SCCs, none of which is strong; a weak SCC may be refined 
into one or more SCCs, none of which is weak. This strength reduction 
theorem allows us to use special model checking algorithms inside the 
abstraction refinement loop as soon as a strong SCC-closed set becomes 
weak or terminal. 

Deciding the strength of an SCC-closed set strictly according to its 
definition is expensive. In the actual implementation, we can make 
conservative decisions about the strength of an SCC $C$ as follows: 


■ $C$ is weak if $C \subseteq F_i$ for every $F_i \in \mathcal{F}$; 


■ $C$ is terminal if $C$ is weak, and either $(\mathsf{EY}\,C) \setminus C = \emptyset$, or $(\mathsf{EY}\,C) \setminus C \subseteq C_t$, 
where $C_t$ is a terminal SCC; 


■ $C$ is strong otherwise. 
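
The conservative rules above translate almost directly into set operations. The sketch below is an explicit-state illustration under stated assumptions: `image` plays the role of EY, `fair_sets` is the list of fair sets, and `terminal_scc` is a hypothetical optional argument holding an SCC already known to be terminal.

```python
def image(edges, src):
    """EY: the successors of the states in `src`."""
    return {v for u, v in edges if u in src}

def analyze_strength(C, edges, fair_sets, terminal_scc=frozenset()):
    """Conservative strength of the SCC C, following the three rules above."""
    C = set(C)
    if not all(C <= F for F in fair_sets):
        return "strong"                 # not contained in every fair set
    escaping = image(edges, C) - C      # successors that leave C
    if not escaping or escaping <= set(terminal_scc):
        return "terminal"
    return "weak"

edges = {(0, 1), (1, 0), (1, 2), (2, 2), (3, 0), (3, 3)}
fair_sets = [{0, 1, 2}]
print(analyze_strength({2}, edges, fair_sets))             # no escaping edges
print(analyze_strength({0, 1}, edges, fair_sets))          # escapes to state 2
print(analyze_strength({0, 1}, edges, fair_sets, {2}))     # escapes only into a terminal SCC
print(analyze_strength({3}, edges, fair_sets))             # not inside the fair set
```

Because the test is conservative, a set reported as strong may still be weak according to Definition 5.3; the answer is only an upper bound on the strength.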


We use the example in Figure 5.1 to show the impact of composition. 
The three Büchi automata have one acceptance condition ($\mathcal{F} = \{F_1\}$) 
and are defined in the same state space. State 01 is labeled $\neg p$; all other 
states are labeled true implicitly. Double circles indicate that the states 
[Figure 5.1 panels: two automata with 2 strong SCCs each (top); their composition with 1 weak SCC and 1 terminal SCC (bottom).] 


Figure 5.1. Parallel composition of automata and its impact on SCCs. 


satisfy the acceptance condition. In this figure, the parallel composition 
of the two automata at the top produces the automaton at the bottom. 
Note that only transitions that are allowed by both parent automata 
appear in the composed system. Both automata at the top are strong, 
although their SCC partitions are different. The composed system, how- 
ever, has a weak SCC, a terminal SCC, and two non-fair SCCs. Its SCC 
partition is a refinement of both previous partitions. 


5.3 The D’n’C Algorithm 


The theorems in the previous section motivate the generic SCC analysis 
algorithm in Figure 5.2. The Divide and Compose (D’n’C) algorithm 
[WBH+01], whose entry function is GENERIC-REFINEMENT, takes 
as arguments a Büchi automaton $\mathcal{A}$ and a set $L$ of over-approximated 
abstract models, which includes $\mathcal{A}$ itself. The relation $\preceq$ that is applied 
to over-approximated models in $L$ is not required to be a total order. 
The procedure returns true if a fair cycle exists in $\mathcal{A}$, and false otherwise. 
type Entry = record 
    C;   // An SCC-closed set of A 
    L’;  // Set of abstract models that have been considered 
    s    // Upper bound on the strength of the SCC 
end 


GENERIC-REFINEMENT(A, L) {   // Concrete and abstract models 
    var Work: set of Entry; 
    Work = {(S, ∅, strong)}; 
    while (Work ≠ ∅) { 
        Pick an entry E = (C, L’, s) from Work; 
        Choose A’ ∈ L such that there is no A’’ ∈ L’ with A’’ ≼ A’; 
        if (A’ = A or ENDGAME(C, s)) { 
            if (MODEL-CHECK(A, C, s)) 
                return true; 
        } 
        else { 
            Over-approx. reachability computation on A’; 
            𝒞 := SCC-DECOMPOSE(C, A’); 
            if (𝒞 ≠ ∅ and A’ = A) 
                return true; 
            for (all C ∈ 𝒞) { 
                s := ANALYZE-STRENGTH(C, A’); 
                insert (C, L’ ∪ {A’}, s) in Work; 
            } 
        } 
    } 
    return false; 
} 


MODEL-CHECK(A, C, s) {   // Automaton, SCC-closed set, strength 
    case (s) { 
        strong:   return Q0 ∩ EG_fair(C) ≠ ∅; 
        weak:     return Q0 ∩ EF EG(C) ≠ ∅; 
        terminal: return Q0 ∩ EF(C) ≠ ∅; 
    } 
} 


Figure 5.2. The generic SCC analysis algorithm D'N'C. 
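
To make the control flow of Figure 5.2 concrete, here is a small explicit-state sketch in Python. It is not the symbolic implementation: models are edge sets over a shared state space, ordered from coarsest to the concrete model, and the integer `k` stands in for the set L’ of models already applied. The LIFO worklist, the `endgame` default and all helper names are assumptions made for illustration, and the concrete model is assumed to be complete (every state has a successor). Unlike the figure, this sketch always settles the concrete model through `model_check`.

```python
def reach(edges, src):
    """States reachable from `src` (including `src` itself)."""
    seen, frontier = set(src), set(src)
    while frontier:
        frontier = {v for u, v in edges if u in frontier} - seen
        seen |= frontier
    return seen

def fair_sccs(edges, region, fair_sets):
    """Nontrivial SCCs inside `region` that intersect every fair set."""
    sub = {(u, v) for u, v in edges if u in region and v in region}
    rev = {(v, u) for u, v in sub}
    found, left = [], set(region)
    while left:
        s = next(iter(left))
        scc = reach(sub, {s}) & reach(rev, {s})
        left -= scc
        nontrivial = len(scc) > 1 or (s, s) in sub
        if nontrivial and all(scc & F for F in fair_sets):
            found.append(frozenset(scc))
    return found

def strength(C, edges, fair_sets):
    """Conservative strength: terminal, weak, or strong."""
    if not all(C <= F for F in fair_sets):
        return "strong"
    return "weak" if {v for u, v in edges if u in C} - C else "terminal"

def model_check(edges, init, C, s, fair_sets):
    reachable = reach(edges, init)
    if s == "terminal":                       # Q0 & EF(C) nonempty
        return bool(reachable & C)
    wanted = [] if s == "weak" else fair_sets  # weak: any cycle in C is accepting
    return any(scc & reachable for scc in fair_sccs(edges, C, wanted))

def generic_refinement(states, models, init, fair_sets,
                       endgame=lambda C, s: s != "strong"):
    concrete = models[-1]
    work = [(frozenset(states), 0, "strong")]
    while work:
        C, k, s = work.pop()                   # LIFO; the order is a policy choice
        if k == len(models) - 1 or endgame(C, s):
            if model_check(concrete, init, C, s, fair_sets):
                return True
            continue
        edges = models[k]
        reachable = reach(edges, init)         # over-approximate reachability
        for scc in fair_sccs(edges, C & reachable, fair_sets):
            work.append((scc, k + 1, strength(scc, edges, fair_sets)))
    return False

states, init = {0, 1, 2, 3}, {0}
concrete = {(0, 1), (1, 1), (2, 3), (3, 2)}    # fair cycle 2 <-> 3 is unreachable
abstract = concrete | {(1, 2)}                 # over-approximation adds an edge
print(generic_refinement(states, [abstract, concrete], init, [{2}]))
```

In the example, the abstract model exposes one suspect fair SCC, which the concrete model then shows to be unreachable, so the language is empty.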
The algorithm keeps a set Work of obligations, each consisting of a 
set of states, the series of abstract models that have been applied to it, 
and an upper bound on its strength. Initially, the entire state space is in 
Work, and the algorithm keeps looping until Work is empty or a fair SCC 
has been found. The loop starts by selecting an element (C, L’, s) from 
Work and a new abstract model A’ from L. If A’ = A, the algorithm 
may decide to run a standard model checking procedure on the SCC 
at hand. Otherwise, it decomposes C into accepting SCCs and after 
analyzing their strengths, adds them as new Work. At any stage, for 
any entry (C, L’, s) of Work, C is guaranteed to be an SCC-closed set 
of A, and the sets of states in Work are always disjoint. Termination 
of the procedure is guaranteed by the finiteness of L and of the set of 
SCCs of A. 

The algorithm uses several subroutines. Subroutine SCC-DECOMPOSE 
takes an automaton A’ and a set C, intersects the state space of A’ with 
C to yield a new automaton A’’, and returns the set of accepting SCCs of 
A’’. The subroutine avoids working on any non-fair SCCs, as justified by 
Observation 5.2. Subroutine ANALYZE-STRENGTH returns the strength 
of the SCC-closed set. Subroutine MODEL-CHECK returns true if and only 
if a fair cycle is found using the appropriate model checking technique 
for the strength of the given SCC. 

The way entries and abstract models are picked is not specified, and 
neither is it stated when ENDGAME returns true. These functions can 
depend on factors such as the strength of the entry, the abstract models 
that have been applied to it, and its order of insertion. In later sections, 
these functions will be made concrete. 

When decomposing an SCC-closed set $C$, the complement set $\overline{C}$ can be 
used as the don't care condition to constrain and speed up the computation. 
This is usually a significantly larger don't care set than the Reachability 
Don't Cares (RDCs); therefore, the use of $\overline{C}$ as don't cares can lead to a 
significant improvement in computational efficiency. The time spent 
on computing $\overline{C}$ is small because it comes from an abstract model, where 
image and pre-image computations are cheaper than in the concrete 
system. 

The reachable states of the current over-approximation are kept around 
to discard unreachable SCCs. When the system is refined, the set of 
reachable states is computed anew, but this computation is limited to 
the previous reachable states, because the new reachable states are al- 
ways contained in the previous reachable states as long as the new ab- 
stract system is a refinement of the previous one. Therefore, previous 
reachable states can be used as a care set in computing the new ones. 
Although reachability analysis is performed multiple times (once for every 
$\mathcal{A}' \in L$), previous work [MJH+98] has shown that the use of 
approximate reachability information as a care set may more than compensate 
for the overhead. 


The proposed algorithm can be extended to include the use of 
under-approximations as well. Whereas over-approximations can be used to 
discard the possibility of an accepting cycle, under-approximations can 
be used to assert its existence. Let $\mathcal{A}_1$ and $\mathcal{A}_2$ be under-approximations 
of $\mathcal{A}$. If $\mathcal{A}_1$ contains an accepting cycle, so does $\mathcal{A}$. Furthermore, if an 
SCC $C_1$ of $\mathcal{A}_1$ and an SCC $C_2$ of $\mathcal{A}_2$ overlap, then $\mathcal{A}$ contains an SCC 
$C \supseteq C_1 \cup C_2$. Remember that SCC enumeration algorithms [XB00, 
BGS00, GPP03] compute each SCC by starting from a seed, which 
normally is a single state. Since we know that both $C_1$ and $C_2$ are subsets 
of $C$, we can use $C_1 \cup C_2$ as the seed to compute $C$. By doing so, part 
of the work that went into computing $C_1$ or $C_2$ can be reused. 


Under-approximations of an SCC can also be used as don't cares in 
BDD based symbolic algorithms to restrict the computation to a state 
subspace. For instance, in SCC enumeration, as soon as we know that 
$C_1 \cup C_2$ is a subset of an SCC, we can focus our attention on the state 
subspace $\overline{C_1 \cup C_2}$. We can modify the transition relation of the model 
to reflect this shift of attention (with the goal of finding smaller BDD 
representations for the transition relation and sets of states). To better 
understand the use of DCs, let us review some background information 
about the implementation of BDD based image computation. 


Image and pre-image computations are the most resource-consuming 
steps in BDD based model checking. Since their runtime complexity 
depends on the BDD sizes of the operators involved, it is important to 
minimize the sizes of both the transition relation and the argument to 
the (pre-)image computation—the set of states. The size of a BDD is 
not directly related to the size of the set it represents. If we need not 
represent a set exactly, but can instead determine an interval in which 
it may lie, we can use generalized cofactors [CBM89b, CM90] to find a 
set within this interval with a small BDD representation. 


Often, we are only interested in the results as far as they lie within a 
care set $K$ (or outside a don't care set $\overline{K}$). Since the language emptiness 
problem is only concerned with the set of reachable states $R$, we can 
regard $R$ as a care set, and add or delete edges in the state transition 
graph that emanate from unreachable states. By doing this, the image 
of a set that is contained within $R$ remains the same. Likewise, the part 
of the pre-image of a set $S$ that intersects $R$ remains the same, even 
if unreachable states are introduced into $S$ by adding edges. This use 
of the states in $\overline{R}$ as don't cares, which is often called the reachability 
don't cares or RDCs, depends on the fact that no edges from reachable 
to unreachable states are added. 

SCC-closed sets are care sets that are often much smaller than the 
set of reachable states, and thus can be used to increase the chance of 
finding small BDDs. We cannot, however, use the approach outlined for 
the reachable states directly, since there may be edges added from an 
SCC-closed set to other states, as the one from State 4 to State 6 in 
Figure 5.3. 





Figure 5.3. An example of using don't cares in the computation of SCCs. 


We show here that in order to use arbitrary sets as care sets in image 
computation, a "safety zone" consisting of the pre-image of the care set 
needs to be kept; similarly, for pre-image computation, the "safety zone" 
must consist of the image of the care set. 


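THEOREM 5.6 Let $Q$ be a set of states and let $T \subseteq Q \times Q$ be a transition 
relation. Let $K \subseteq Q$ be a care set, and $B \subseteq K$ a set of states. Finally, let 
$T' \subseteq Q \times Q$ be a transition relation and $B' \subseteq Q$ a set of states such that 


$$T \cap (K \times K) \subseteq T' \subseteq T \cup (\overline{K} \times Q) \cup (Q \times \overline{K}), \text{ and}$$
$$B \subseteq B' \subseteq B \cup \overline{\mathsf{EX}_{T'}(K)} .$$
Then, $\mathsf{EY}_{T'}(B') \cap K = \mathsf{EY}_T(B) \cap K$. 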


PROOF: First, suppose that $q' \in \mathsf{EY}_{T'}(B') \cap K$, and let $q \in B'$ be such 
that $q' \in \mathsf{EY}_{T'}(\{q\}) \cap K$. Since $q' \in \mathsf{EY}_{T'}(\{q\})$, we have $q \in \mathsf{EX}_{T'}(\{q'\})$, and 
because $q' \in K$, we have $q \in \mathsf{EX}_{T'}(K)$. Hence, $q \in B'$ implies $q \in B$, 
and $q, q' \in K$, which means that $q' \in \mathsf{EY}_T(\{q\}) \cap K$. Finally, $q \in B$ 
implies $q' \in \mathsf{EY}_T(B) \cap K$. 
Conversely, suppose that $q' \in \mathsf{EY}_T(B) \cap K$, and let $q \in B$ be such that 
$q' \in \mathsf{EY}_T(\{q\}) \cap K$. Now $q, q' \in K$, and hence $q' \in \mathsf{EY}_{T'}(\{q\}) \cap K$, and 
since $q \in B'$, $q' \in \mathsf{EY}_{T'}(B') \cap K$. 


Hence, we can choose $T'$ and $B'$ within the given intervals so that they 
have small representations, and use them instead of $T$ and $B$. By 
symmetry, we can prove the following theorem. 


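THEOREM 5.7 Let $Q$ be a set of states and let $T \subseteq Q \times Q$. Let $K \subseteq Q$, 
$B \subseteq K$, $T' \subseteq Q \times Q$, and $B' \subseteq Q$ be such that 


$$T \cap (K \times K) \subseteq T' \subseteq T \cup (\overline{K} \times Q) \cup (Q \times \overline{K}), \text{ and}$$
$$B \subseteq B' \subseteq B \cup \overline{\mathsf{EY}_{T'}(K)} .$$


Then, $\mathsf{EX}_{T'}(B') \cap K = \mathsf{EX}_T(B) \cap K$. 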
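
Both theorems can be sanity-checked on small random instances. The sketch below is an illustration only: it uses explicit sets in place of BDDs, reads the bounds with the complement of K as the don't care set, and builds a random T' and B' inside the stated intervals. The helper names and the random construction are assumptions, not part of the original text.

```python
import random

def EY(T, B):
    """Image: successors of the states in B."""
    return {v for u, v in T if u in B}

def EX(T, B):
    """Pre-image: predecessors of the states in B."""
    return {u for u, v in T if v in B}

def check(rng, n=6):
    """Draw one random instance and test both equalities."""
    Q = set(range(n))
    pairs = [(u, v) for u in Q for v in Q]
    T = {p for p in pairs if rng.random() < 0.3}
    K = {q for q in Q if rng.random() < 0.5}
    B = {q for q in K if rng.random() < 0.5}
    # T2 agrees with T inside K x K; edges touching the complement of K are arbitrary.
    T2 = {(u, v) for (u, v) in pairs
          if ((u, v) in T and u in K and v in K)
          or ((u not in K or v not in K) and rng.random() < 0.5)}
    # States outside the safety zone may be added to B freely.
    B_img = B | {q for q in Q - EX(T2, K) if rng.random() < 0.5}
    B_pre = B | {q for q in Q - EY(T2, K) if rng.random() < 0.5}
    image_ok = (EY(T2, B_img) & K) == (EY(T, B) & K)      # Theorem 5.6
    preimage_ok = (EX(T2, B_pre) & K) == (EX(T, B) & K)   # Theorem 5.7
    return image_ok and preimage_ok

print(all(check(random.Random(seed)) for seed in range(200)))
```

A randomized check of this kind is no substitute for the proof, but it is a quick way to catch a misread bound when re-implementing the care-set manipulation.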


Edges are added to and from states in the set $\overline{K}$ (states outside $K$), 
while the safety zone for (pre-)image computation excludes the immediate 
(successors) predecessors of $K$. Note that the validity of the 
aforementioned use of the reachable states as care set follows as a corollary 
of these two theorems. Figure 5.4 shows a possible choice of $T'$ given 
the same $T$ and $K$ of Figure 5.3. 





Figure 5.4. Another example of using don't cares in the computation of SCCs. 


For that choice of $T'$, it shows, enclosed in the dotted line, the set 
$\overline{\mathsf{EX}_{T'}(K)}$. If $B = \{1, 2\}$, then $\mathsf{EY}_T(B) \cap K = \{2, 3, 4\}$. Suppose $B' = 
\{1, 2, 4\}$; then 


$$\mathsf{EY}_{T'}(B') \cap K = \{2, 3, 4, 6\} \cap \{0, 1, 2, 3, 4\} = \{2, 3, 4\} = \mathsf{EY}_T(B) \cap K .$$
Note that the addition of the edge from State 7 to State 3 causes the 
former to be excluded from $\overline{\mathsf{EX}_{T'}(K)}$. 


5.4 The Composition Policies 


The SCC analysis algorithm described in the previous section is generic, 
since it does not specify: 


1 what set of abstract models L is available; 


2 the rule to select the next abstract model A’ to be applied to a set 
S; 


3 the priority function used to choose which element to retrieve from 
the Work set; 


4 the criterion used to decide when to switch to the endgame. 


These four aspects make up a policy and are the subjects of this section. 
We assume that $\mathcal{A}$ is the parallel composition of a set of submodules 
$M = \{M_1, \ldots, M_m\}$, and the set $L$ of over-approximations consists of 
the compositions of subsets of $M$: 

$$L \subseteq \{\, M_{j_1} \parallel \cdots \parallel M_{j_p} \mid \{j_1, \ldots, j_p\} \subseteq \{1, \ldots, m\} \,\} .$$


We also assume that states of $\mathcal{A}$ are the valuations of a set of $r$ binary 
variables $V$. The set of variables controlled by each module $M_i$ is 
nonempty and is a subset of $V$. Furthermore, let $n_{\mathcal{A}}$ and $n_{\mathcal{A}'}$ be the numbers 
of states in A and its over-approximation respectively, then 25 < na. 

The set of all over-approximations generated from subsets of $M$ forms 
a lattice under the relation $\preceq$, as is shown in Figure 5.5 for $m = 4$. In 
the case illustrated by this figure, the coarsest abstraction, which is the 
composition of no module, is the 1 of the lattice. Note that this abstraction is 
never used in practice. The concrete system is the composition of all four 
modules. For sufficiently large $m$, it is impractical to make use of all $2^m$ 
abstract models; consequently, we shall only consider efficient policies 
in which any given state contained in the SCC-closed set is passed to 
SCC-DECOMPOSE at most $O(r)$ times. 

Specifically, we shall stipulate that there is a constant $\lambda$ such that $L$ 
can be partitioned into subsets $L_1, \ldots, L_r$ satisfying the following 
conditions: 
[Figure 5.5 levels, top to bottom: 0 modules; 1 module; 2 modules; 3 modules (M1 || M2 || M3, M1 || M2 || M4, M1 || M3 || M4, M2 || M3 || M4); 4 modules (A = M1 || M2 || M3 || M4).] 


Figure 5.5. Lattice of approximations. 


1 $|L_i| \le \lambda$; 
2 for every $\mathcal{A}' \in L_i$, $n_{\mathcal{A}'} \le 2^i$; 
3 $\mathcal{A} \in L_r$. 


Two efficient policies satisfying these conditions are illustrated in 
Figure 5.5. The first one is called the popcorn-line policy, which corresponds 
to the solid thick lines at the left of Figure 5.5. Let $(j_1, \ldots, j_m)$ be a 
permutation of $(1, \ldots, m)$ that identifies a linear order of the modules. 
The set of approximations with a popcorn-line policy is given as follows: 

$$L = \{\, \mathcal{A}_i = M_{j_1} \parallel \cdots \parallel M_{j_i} \mid 1 \le i \le m \,\} .$$


When an entry $E = (S, L', s)$ is retrieved from Work, the $\mathcal{A}_i$ of lowest 
index that is not present in $L'$ is chosen as the next approximation 
$\mathcal{A}'$. With $(j_1, \ldots, j_4) = (1, 2, 3, 4)$ and $\lambda = 1$, the approximations in 
Figure 5.5 are: 


$\mathcal{A}_1 = M_1$, 

$\mathcal{A}_2 = M_1 \parallel M_2$, 

$\mathcal{A}_3 = M_1 \parallel M_2 \parallel M_3$, 

$\mathcal{A}_4 = M_1 \parallel M_2 \parallel M_3 \parallel M_4$. 


Another policy is called the lightning-bolt policy. In Figure 5.5, the 
lightning-bolt policy is indicated by thick gray lines at the right. Let 
$(j_1, \ldots, j_m)$ be a permutation of $(1, \ldots, m)$ that identifies a linear order 
of the modules. The set of approximations with this policy is 


$$L = \{\, \mathcal{A}_{2i-1} = M_{j_1} \parallel \cdots \parallel M_{j_i} \mid 1 \le i \le m \,\} \cup \{\, \mathcal{A}_{2i} = M_{j_{i+1}} \mid 1 \le i < m \,\} .$$
When an entry $E = (S, L', s)$ is retrieved from Work, among the 
candidate approximations $\mathcal{A}_i$ not yet applied, 
the one with the lowest index is chosen first. Let the order of submodules be 
(4, 2, 3, 1); the set of approximations in Figure 5.5 is: 


$\mathcal{A}_1 = M_4$, $\mathcal{A}_2 = M_2$, 
$\mathcal{A}_3 = M_4 \parallel M_2$, $\mathcal{A}_4 = M_3$, 
$\mathcal{A}_5 = M_4 \parallel M_2 \parallel M_3$, $\mathcal{A}_6 = M_1$, 


$\mathcal{A}_7 = M_4 \parallel M_2 \parallel M_3 \parallel M_1$. 


In both cases, the number of times a state appears in the set passed to 
SCC-DECOMPOSE is bounded by the number of approximations in $L$. 
Therefore, a popcorn-line policy tends to call SCC-DECOMPOSE fewer times. 
A lightning-bolt policy may break up the SCC-closed sets with easy 
approximations ($\{\mathcal{A}_{2i}\}$) before applying harder approximations ($\{\mathcal{A}_{2i-1}\}$) 
to them, and therefore tends to use less memory. 
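
The two sequences of approximations are easy to generate mechanically. In the sketch below, each abstract model is represented simply by the tuple of module indices that are composed; the function names are hypothetical, and `order` is the permutation $(j_1, \ldots, j_m)$.

```python
def popcorn_line(order):
    """A_i composes the first i modules of `order`."""
    return [tuple(order[:i]) for i in range(1, len(order) + 1)]

def lightning_bolt(order):
    """Odd positions: growing compositions; even positions: the next single module."""
    seq = []
    for i in range(1, len(order) + 1):
        seq.append(tuple(order[:i]))       # A_{2i-1}
        if i < len(order):
            seq.append((order[i],))        # A_{2i}, the module added next
    return seq

print(popcorn_line([1, 2, 3, 4]))
print(lightning_bolt([4, 2, 3, 1]))
```

For the order (4, 2, 3, 1), the second call lists the seven approximations in the same order as the example above, ending with the composition of all four modules.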

The popcorn-line approach defines an SCC partition refinement tree, 
whose roots are the fair SCCs in the most abstract model and whose 
other nodes correspond to fair SCCs in the more refined abstract mod- 
els. An example is given in Figure 5.6, which highlights the potential ad- 
vantages of SCC refinement. The figure corresponds to a model of eight 
dining philosophers, with a property that states that under the given 
fairness constraints, if a philosopher is hungry, she eventually eats. The 
system has nine modules, which are the property automaton and the 
eight modules for the philosophers. The property passes in the concrete 
model, i.e., no fair cycles exist in the system. 

Only the nodes representing fair SCCs are shown in this tree. The 
nodes at Level $i$ are the fair SCCs of $\mathcal{A}_i$, together with their numbers 
of concrete states. $\mathcal{A}_1$ is the property automaton. A separate 
reachability analysis shows that there are about 47k reachable states in the 
concrete model. Note that only very small sets of states remain after 
the composition of the first four modules: the property automaton, the 
philosopher named in the property, and her two neighbors. We are able 
to prune away a large number of reachable states in the first few levels, and 
no work is done on the concrete system. 

To define a policy, the order in which elements are retrieved from 
the Work set also needs to be specified. Two obvious choices are FIFO 
and LIFO order. As one would expect, the SCC refinement tree can 
be traversed in breadth-first manner for a FIFO order, and in 
depth-first manner for a LIFO order. When, as in Figure 5.6, there are no fair 
cycles in A, the order in which the tree is visited is immaterial. However, 
in the presence of concrete fair cycles, one strategy may lead to earlier 
termination than the other. If one assumes that fair cycles are 
numerous, then depth-first search is particularly attractive. Breadth- 
first search, on the other hand, can be implemented with low overhead, 


because at any time, only one abstract model needs to be constructed 
and to remain active in the main memory. 
Figure 5.6. An SCC partition refinement tree. 


Chapter 6 


DISJUNCTIVE DECOMPOSITION 


6.1 Adaptive Popcorn-line Policy 


It may not be practical to adopt the popcorn-line policy all the way 
down to the concrete model, because 


1 there may be too much overhead in analyzing all the abstract models; 


2 if an SCC-closed set becomes weak or terminal, checking it directly 
in the concrete model may be cheap. 


In these two cases, one may decide to switch to the endgame. That is, 
after spending a reasonable amount of effort on decomposing the SCCs in 
the abstract models, we jump to the concrete model. When the endgame 
comes, there are different ways of jumping to the concrete system—all 
of them can be considered as variants of the popcorn-line policy. 

The first variant is to go from the last abstract model $\mathcal{A}'$ to $\mathcal{A}$ directly, 
and search for fair cycles inside each SCC-closed set $S$ in Work. Both 
SCC hull and SCC enumeration algorithms can be used for the fair-cycle 
detection in $\mathcal{A}$. Assume, for instance, that $\mathcal{A}$ is the composition of the 
set of submodules $\{M_1, \ldots, M_8\}$, and we decide to jump after composing 
the first three submodules. The first variant of the popcorn-line policy 
can be described as follows: 


$\mathcal{A}_1 = M_1$, 

$\mathcal{A}_2 = M_1 \parallel M_2$, 

$\mathcal{A}_3 = M_1 \parallel M_2 \parallel M_3$, 

$\mathcal{A}_4 = M_1 \parallel M_2 \parallel M_3 \parallel \cdots \parallel M_8$. 





Alternatively, we can trim the SCC-closed sets of $\mathcal{A}'$ further before 
searching for fair cycles in the concrete model. The remaining submodules 
are applied, one at a time, to further partition these SCC-closed sets. 
This variant, called the Cartesian product approach, is characterized as 
follows: 

$\mathcal{A}_1 = M_1$, 
$\mathcal{A}_2 = M_1 \parallel M_2$, 
$\mathcal{A}_3 = M_1 \parallel M_2 \parallel M_3$, 
$\mathcal{A}_4 = M_4$, 
$\ldots$ 
$\mathcal{A}_8 = M_8$, 
$\mathcal{A}_9 = M_1 \parallel M_2 \parallel M_3 \parallel \cdots \parallel M_8$. 


Note that we are using $\mathcal{A}_4, \ldots, \mathcal{A}_8$ to further reduce the SCC-closed sets 
of $\mathcal{A}_3$ before going to the concrete model $\mathcal{A}_9$. Given the fact that 
each submodule $M_i$ is relatively small and we consider them one at 
a time, the calls to SCC-DECOMPOSE in $\mathcal{A}_4, \ldots, \mathcal{A}_8$ are cheap. In fact, the 
partition of the state space in $\mathcal{A}_3$ has been based on the assumption 
that the state variables of other submodules are free variables (i.e., they 
can take arbitrary values at all times). Calling SCC-DECOMPOSE on 
these remaining modules individually can constrain their state variables, 
resulting in further partitioning of the SCC-closed sets. A direct analogy 
can be observed between this approach and the Machine-by-Machine 
state space traversal algorithm [CHM+94] for computing the approximate 
reachable states. 

The third variant, called the one-step further composition approach, 
is characterized as follows: 


$\mathcal{A}_1 = M_1$, 

$\mathcal{A}_2 = M_1 \parallel M_2$, 

$\mathcal{A}_3 = M_1 \parallel M_2 \parallel M_3$, 
$\mathcal{A}_4 = \mathcal{A}_3 \parallel M_4$, 
$\ldots$ 
$\mathcal{A}_8 = \mathcal{A}_3 \parallel M_8$, 
$\mathcal{A}_9 = M_1 \parallel M_2 \parallel M_3 \parallel \cdots \parallel M_8$. 


We are using $\mathcal{A}_4, \ldots, \mathcal{A}_8$ to further fracture the SCC-closed sets of $\mathcal{A}_3$. Note 
that the previous Cartesian product approach does not compose prior 
to making the full jump; in contrast, this one-step further composition 
approach invests more heavily by composing $\mathcal{A}_3$ with each of the 
remaining submodules. At each step from $\mathcal{A}_4$ to $\mathcal{A}_8$, we start with the 
refined SCC-closed sets computed in the previous step. For a transition 
to exist in the composition, it must exist in both of the machines 
being composed. Whereas the Cartesian product approach never fractures 
SCCs by this joint constraint, this third variant does, ultimately leading 
to the partitioning of these SCCs into smaller SCC-closed sets. 
Following this line to the extreme would lead us all the way back to 
the original popcorn-line policy, which is considered the fourth variant. 
We note that these four variants are only representatives of the general 
framework of the adaptive popcorn-line policy. The first variant represents 
the least investment in compositional analysis, and therefore suffers the 
least amount of overhead. However, when it performs the most expensive 
part of the computation, cycle detection in the exact system, it must 
search in larger state subspaces. Conversely, the fully iterative approach 
of the fourth variant has the smallest state subspaces to search in the 
concrete model, but incurs the greatest overhead in analyzing abstract 
models. 
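
The first three variants differ only in which models are inserted between the jump point and the concrete system. The following sketch enumerates them, again representing each model by the tuple of composed module indices; the function names and the parameter `k` (the number of modules composed before the jump) are assumptions for illustration.

```python
def prefixes(order, k):
    """Popcorn-line prefix: A_1 .. A_k."""
    return [tuple(order[:i]) for i in range(1, k + 1)]

def direct_jump(order, k):
    """First variant: jump straight to the concrete model."""
    return prefixes(order, k) + [tuple(order)]

def cartesian_product(order, k):
    """Second variant: apply each remaining module on its own, then jump."""
    return prefixes(order, k) + [(m,) for m in order[k:]] + [tuple(order)]

def one_step_further(order, k):
    """Third variant: compose A_k with each remaining module, then jump."""
    base = tuple(order[:k])
    return prefixes(order, k) + [base + (m,) for m in order[k:]] + [tuple(order)]

modules = list(range(1, 9))           # M_1 .. M_8, jump after three modules
for variant in (direct_jump, cartesian_product, one_step_further):
    print(variant.__name__, variant(modules, 3))
```

With eight modules and a jump after three, the first variant yields four models and the other two yield nine, matching the listings above.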


6.2 Disjunctive Decomposition Theorem 


After switching to the endgame and further fracturing the SCC-closed 
sets, we shall search for reachable fair cycles in the concrete system. 
Since the concrete system often has a large number of states, it is desir- 
able to decompose the entire state space into many subspaces and search 
them separately. This requires the decomposition of state space to be 
disjunctive — that is, the language accepted by each subspace is a subset 
of the original language, and the union of these language subsets is the 
original language. 

Let the SCC quotient graph of a labeled generalized Büchi automaton 
$\mathcal{A}$ be $G = (C, C_0, T_C, F_C)$, where $C$ is the set of SCCs in $\mathcal{A}$, $C_0 \subseteq C$ is the 
set of initial SCCs, $T_C \subseteq C \times C$ is the transition relation, and $F_C \subseteq C$ is the 
set of fair SCCs. 

Let $G' = (C', C_0', T_C', F_C')$ be a subgraph of $G$, where $C' \subseteq C$, $C_0' \subseteq C_0$, 
$T_C' \subseteq T_C$, and $F_C' \subseteq F_C$. In other words, removing some nodes or 
edges of an SCC graph, or making some fair nodes non-fair, produces a 
subgraph. 

A subgraph of the SCC graph G(A) induces a new Büchi automaton. 


DEFINITION 6.1 Given the SCC quotient graph $G$ of $\mathcal{A}$ and a subgraph 
$G'$, the induced automaton $\mathcal{A} \Downarrow G' = \langle S', S_0', T', A, \Lambda, \mathcal{F}' \rangle$ is defined 
as follows: 


■ $S' \subseteq S$ is the subset of original states that appear in $C'$; 

■ $S_0' \subseteq S_0$ is the subset of original initial states that appear in $C_0'$; 


■ $T' \subseteq T$ is the subset of original transitions among the states that 
appear in $C'$; 


■ each $F_i' \in \mathcal{F}'$ is a subset of the corresponding $F_i \in \mathcal{F}$, namely the intersection of $F_i$ and 
the SCCs in $F_C'$. 
In other words, $\mathcal{A} \Downarrow G'$ is the original automaton $\mathcal{A}$ with its operation 
restricted to the set of states $S'$. It follows that $\mathcal{A} \Downarrow G(\mathcal{A}) = \mathcal{A}$. 

Since accepting runs in the induced automaton $\mathcal{A} \Downarrow G'$ are always 
accepting in $\mathcal{A}$, the language of the new automaton is a subset of $\mathcal{L}(\mathcal{A})$. 
Furthermore, the pruning operation on the SCC graph, defined as removing 
nodes that are not on any path from initial nodes to fair nodes, does 
not change the language accepted by the corresponding automaton. This 
claim can be extended to the SCC subgraph of any over-approximation 
of $\mathcal{A}$. 


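OBSERVATION 6.2 Let $\mathcal{A} \preceq \mathcal{A}'$ and let $G'$ be a subgraph of $G(\mathcal{A}')$. Then 
$\mathcal{L}(\mathcal{A} \Downarrow G') \subseteq \mathcal{L}(\mathcal{A})$. 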


We define an SCC subgraph $G^{C_j}(\mathcal{A})$ for every fair node $C_j$ of $G(\mathcal{A})$, 
such that it contains all SCCs that are on the paths from the initial 
SCCs to $C_j$ (including $C_j$); furthermore, all the nodes are marked 
non-fair except for $C_j$. We can build $G^{C_j}$ by marking $C_j$ fair and all the other 
nodes non-fair and then pruning the SCC graph. When the context is 
clear, we will simply use $G^{C_j}$ to denote such a subgraph. Each SCC 
subgraph $G^{C_j}$ induces a new automaton that accepts a subset of the 
original language; the union of these subsets of languages is the same as 
the language of the original automaton. 

In addition, $G^{C_j}$ can be further decomposed into subgraphs. An SCC 
subgraph of this kind, denoted by $G_i^{C_j}$, represents the $i$-th path from an 
initial SCC to $C_j$. The languages accepted by the automata $\mathcal{A} \Downarrow G_i^{C_j}$ also 
form a disjunctive decomposition of the language accepted by $\mathcal{A} \Downarrow G^{C_j}$. 
To summarize, we have the following theorem: 


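THEOREM 6.3 (DISJUNCTIVE DECOMPOSITION) Let $\mathcal{A} \preceq \mathcal{A}'$ and let the 
SCC graph $G(\mathcal{A}')$ have a set of SCC subgraphs $\{G_i^{C_j}\}$ as defined above. 
Then, $\mathcal{L}(\mathcal{A}) = \emptyset$ if and only if $\mathcal{L}(\mathcal{A} \Downarrow G_i^{C_j}) = \emptyset$ for every subgraph. 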
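
The subgraphs used in the theorem are the paths of the SCC quotient graph that lead from an initial node to a fair node. The sketch below enumerates them for a small, hypothetical DAG; `succ` maps each SCC node to its successor nodes, and the function name `initial_fair_paths` is an assumption for illustration.

```python
def initial_fair_paths(succ, initial, fair):
    """All paths in the SCC DAG that start at an initial node
    and end at a fair node."""
    paths = []

    def walk(node, path):
        path = path + [node]
        if node in fair:
            paths.append(path)          # one subgraph per (path, fair end node)
        for nxt in sorted(succ.get(node, ())):
            walk(nxt, path)             # fair nodes further down are still explored

    for c in sorted(initial):
        walk(c, [])
    return paths

# Hypothetical quotient graph: a diamond with a single fair node "d".
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
for p in initial_fair_paths(succ, {"a"}, {"d"}):
    print(" -> ".join(p))
```

Each printed path corresponds to one induced subautomaton whose language emptiness can be checked in isolation. The number of such paths can grow exponentially with the size of the DAG, so a practical implementation needs some bound on the graph size.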


A new automaton of the form $\mathcal{A} \Downarrow G_i^{C_j}$ is an under-approximation 
of the exact system. Normally, an under-approximation can be used to 
certify the existence of fair runs, but not to prove language emptiness. 
However, Theorem 6.3 shows that the set $\{\mathcal{A} \Downarrow G_i^{C_j}\}$ produced by 
disjunctive decomposition forms a complete set of under-approximations. 
Therefore, taken together, the subautomata yield exact rather than 
conservative results for language emptiness checking. 

One advantage of applying this disjunctive decomposition theorem 
is that it gives us the ability to check each of these new automata 
separately. When we can restrict the search to a smaller state space, we 
increase the effectiveness of applying don't cares to speed up symbolic 
image and pre-image computations. 


6.3 Guided Search for Fair Cycles 


Theorem 6.3 allows us to disjunctively decompose the concrete system 
into subautomata $\{\mathcal{A} \Downarrow G_i^{C_j}\}$, where each subgraph $G_i^{C_j}$ is an 
initial-fair path in the SCC quotient graph of $\mathcal{A}'$. Since each of these subgraphs 
corresponds to a depth-first search path in the SCC graph and contains 
a set of abstract counterexamples, it is also called a hyperline. Our fair 
cycle detection algorithm goes through all these hyperlines and checks 
language emptiness on each of them in isolation. 

Computing hyperlines requires not only all fair SCCs of A’, but also 
the non-fair SCCs. These non-fair SCCs can be computed with 
SCC-DECOMPOSE, and just like the fair ones, they can also be computed 
crementally. Although it does not happen often in practice, in the worst 
case, the number of hyperlines in an SCC graph—a DAG—is exponen- 
tial in the size of the graph. In order to avoid an excessive partitioning 
cost on the over-approximations, with the consequent exponential num- 
ber of hyperlines, we apply the following heuristic control of the size of 
the SCC graphs: 


■ skip SCC-DECOMPOSE on S if S is non-fair in G(A’) and its size 
(number of concrete states) is below a certain threshold; 


■ switch to the endgame if the number of edges of the SCC graph G(A’) 
exceeds a certain threshold; 


■ switch to the endgame if the number of fair nodes of the SCC graph 
G(A’) exceeds a certain threshold. 


With such a heuristic control, the number of hyperlines is bounded by 
a constant value. 
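
To illustrate why such a bound matters, the following Python sketch enumerates hyperlines on an explicit SCC graph and gives up once a cap is exceeded. It is only a sketch under stated assumptions: the graph is a small hypothetical adjacency-list DAG, a hyperline is taken to be any path from an initial node to a fair node, and the names `enumerate_hyperlines` and `limit` are illustrative and do not come from the implementation described here.

```python
from typing import Dict, List, Optional, Set


def enumerate_hyperlines(
    dag: Dict[int, List[int]],
    initial: Set[int],
    fair: Set[int],
    limit: int,
) -> Optional[List[List[int]]]:
    """Return all initial-to-fair paths, or None if more than `limit` exist."""
    paths: List[List[int]] = []

    def dfs(node: int, prefix: List[int]) -> bool:
        path = prefix + [node]
        if node in fair:
            paths.append(path)
            if len(paths) > limit:
                return False  # too many hyperlines: the caller would switch to the endgame
        # all() short-circuits, so the search stops as soon as the cap is hit
        return all(dfs(succ, path) for succ in dag.get(node, []))

    for root in sorted(initial):
        if not dfs(root, []):
            return None
    return paths


# Hypothetical SCC graph: two diamonds in series, so each diamond doubles the path count.
scc_graph = {0: [1, 2], 1: [3], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: []}
hyperlines = enumerate_hyperlines(scc_graph, initial={0}, fair={6}, limit=16)
```

With d diamonds in series the graph has 3d + 1 nodes but 2^d such paths, which is the worst case that the thresholds above are meant to exclude.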

In the endgame, we disjunctively decompose the exact state space
into subspaces according to the different hyperlines of the last
abstract model. Every hyperline $G_i^F$ induces a subautomaton of the
exact system. Although subautomata may share states, we can avoid
visiting any state more than once by keeping a global set of visited
states.

Within $\mathcal{A} \Downarrow G_i^F$, we search for cycles that are both reachable and
fair. Although reachability analysis shares the same worst-case
complexity bound with the best cycle-detection algorithm, in practice
it is still much cheaper. This motivates us to always make sure that a
state subspace is reachable before deploying the cycle detection
procedure, in order to avoid searching unreachable states for a fair
cycle.
In particular, fair cycle detection is triggered only after the
reachability analysis hits one or more promising states, that is,
states that are in fair SCC-closed sets and at the same time satisfy
some acceptance conditions.
Recall that the symbolic SCC enumeration algorithms [BGS00, BGS05, 
GPP03] used in SCC-DECOMPOSE compute an SCC by first choosing 
a seed. The promising states encountered during forward reachability 
analysis are good candidates for the seed. We can also give higher pri- 
ority to promising states that are reached earlier in forward reachability 
computation, in the hope of getting a shorter counterexample. In ad- 
dition to the order in which they are encountered during the forward 
search, promising states can also be prioritized according to the number 
of acceptance conditions they satisfy: if two promising states are hit si- 
multaneously by the forward search, whichever satisfies more acceptance 
conditions is preferred. By prioritizing the seeds as described above, we 
heuristically choose the SCC that is expected to be closer to the initial 
states and more likely to be fair; this may reduce the number of reach- 
able states traversed by forward reachability search and may lead to a 
shorter counterexample. The earlier algorithm of [HTKB92] was also
designed to avoid visiting too many reachable states in the search for
fair cycles, but its approach differs significantly from ours.


Although disjunctive decomposition has divided the entire state space
into smaller pieces, each subautomaton may still have many reachable
states. Ideally, we find a fair cycle by traversing only part of the
reachable states of the subautomaton, going directly to a promising
state to start the SCC enumeration. To reach a promising state with the
least possible overhead, i.e., by traversing the fewest reachable
states, we need some guidance for the targeted search. The intermediate
results of the reachability analysis of A’ can provide such guidance.
Reachability analysis with Breadth-First Search (BFS) gives a set of
reachability onion rings, denoted by $\{R^0, R^1, \ldots, R^l\}$; each
ring is the set of states at a certain distance from the initial
states. For example, a state in $R^2$ can be reached from an initial
state in two steps but not fewer. Suppose that $R^3$ is the earliest
ring that contains a promising state; one then wants to spend as little
effort as possible in traversing states in $R^1$ and $R^2$.
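
The onion rings described above can be sketched with an explicit-state breadth-first search. This is a minimal illustration, not the symbolic (BDD based) computation used in the book: states are plain Python values, the successor map `succ` is a hypothetical toy graph, and the helper names are illustrative.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Set

State = Hashable


def onion_rings(succ: Dict[State, Iterable[State]], init: Set[State]) -> List[Set[State]]:
    """Ring i holds the states whose shortest distance from `init` is exactly i."""
    rings = [set(init)]
    reached = set(init)
    while True:
        frontier = {t for s in rings[-1] for t in succ.get(s, ())} - reached
        if not frontier:
            return rings
        rings.append(frontier)
        reached |= frontier


def earliest_ring_with(rings: List[Set[State]], promising: Set[State]) -> Optional[int]:
    """Index of the first ring that contains a promising state, if any."""
    for i, ring in enumerate(rings):
        if ring & promising:
            return i
    return None


# Hypothetical toy graph with a cycle between 'd' and 'e'.
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": ["d"]}
rings = onion_rings(succ, {"a"})
first = earliest_ring_with(rings, promising={"e"})
```

In this toy graph the only promising state lies in the fourth ring, so a targeted search would like to touch as few states of the two rings in between as possible.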


We now present a guided search algorithm for fair cycle detection,
called DETECT-FAIR-CYCLE (Figure 6.1) [WH02]. The procedure is called
by MODEL-CHECK for every hyperline $G_i^F$. There are two global
variables, Reach and Queue, representing the set of already reached
states and the SCC-closed sets that remain to be inspected,
respectively. The reachable onion rings of the abstract model A’,
denoted by absRings or $\{R^i\}$, are used to estimate the distance of
an SCC-closed set to the initial states. We use the distance to rank
the relative importance of SCC-closed sets in the priority queue Queue.
The procedure SCC-DECOMPOSE-WITH-ET searches the SCC-closed sets one by
one for fair cycles; the SCC-closed set closest to the initial states
(measured by the distance in the abstract onion rings) is always picked
first. The global reachable state set Reach may be updated after each
SCC-closed set is inspected, if states not yet in Reach have been
discovered by the forward search of SCC enumeration. The entire
procedure terminates when either all reachable states in all
$\mathcal{A}_{sub}$ are visited, or a fair cycle is found.

Instead of the conventional IMAGE computation, we use a heuristic
algorithm called sharp image computation, denoted IMAGE#, for the
targeted reachability analysis. The goal is to reach a promising seed
state without visiting all the reachable states in lower onion rings.
The pseudo code for sharp image computation is also given in Figure
6.1. Let D be the set of states for which we want to find the
successors (the from set), and $\{R^i\}$ be the set of reachable onion
rings from an abstract model. The procedure first finds the abstract
onion ring that is closest to the target and at the same time
intersects D. The intersection of this ring and D has the shortest
approximate distance to a promising state. This set is further
compacted into $D^\#$ by BDD-SUBSETTING [RS95, PH98]. As a generic
function, BDD-SUBSETTING can return a minterm, a cube, or an arbitrary
subset of $(D \cap R^i)$ with a small BDD representation. Finally, the
image of $D^\#$ is computed with the conventional IMAGE operation. It
is clear that the result is a subset of EY(D).

Our guided search procedure with sharp image computation is different
from the high-density algorithm of [RS95], because our goal in
compacting the from set is to get closer to the fair SCCs, not to
increase the density of its BDD representation. Nevertheless, our
approach shares a common problem with high-density search, namely, how
to recover from dead ends. Since IMAGE# computes only a subset of the
exact image, it is possible for the frontier set, Front, to be empty
before the forward search actually reaches a fixpoint. Whenever this
happens, we need to backtrack and recompute the standard image of Reach
using the transition relation of $\mathcal{A}_{sub}$.

Once we encounter some promising states during the targeted
reachability analysis, we pick a seed from them and start the SCC
computation. If the SCC containing this seed intersects all the fair
sets $F_i \in \mathcal{F}$, we can conclude that the SCC is both
reachable and fair, and terminate the entire procedure immediately. If
the SCC is not fair, it is merged into the set of already reached
states in Reach (because the SCC has proved to be reachable) before the
targeted reachability analysis is resumed. Since every SCC found in
this way is guaranteed to be reachable, the SCC enumeration algorithms
[BGS00, BGS05, GPP03] can be further enhanced with early termination
[SRB02]: they terminate as soon as a fair cycle is found, as opposed to
after both the forward and backward searches from the seed reach their
fixpoints. In lockstep [BGS00], for instance, this requires that after
each forward and backward step, we check whether the intersection of
the forward and backward results satisfies all the fairness conditions;
if it does, the union of the forward and backward search results
contains a reachable fair cycle.

DETECT-FAIR-CYCLE(A, A', G_i^F, Reach, Queue)
{   // model, abs model, hyperline,
    // reached states, and SCC-closed sets
    A_sub  = A  ⇓ G_i^F;
    A'_sub = A' ⇓ G_i^F;
    absRings = COMPUTE-REACHABLE-ONIONRINGS(A'_sub);
    Front = Reach;
    while (true)
    {
        while (Front ≠ ∅) and (Front ∩ Queue = ∅)
        {
            Front = IMAGE#(A_sub, Front, absRings) \ Reach;
            if (Front = ∅)
                Front = IMAGE(A_sub, Reach) \ Reach;
            Reach = Reach ∪ Front;
        }
        if (Front = ∅)
            return false;
        if (SCC-DECOMPOSE-WITH-ET(A_sub, Queue, absRings))
            return true;
    }
}

IMAGE#(A, D, {R^i})
{
    i = l;      // index of the outermost ring in {R^0, ..., R^l}
    while (D ∩ R^i = ∅)
    {
        i = i − 1;
    }
    D# = BDD-SUBSETTING(D ∩ R^i);
    return IMAGE(A, D#);
}

Figure 6.1. Guided search of fair cycles and sharp image.
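
The inner loop of the guided search, with sharp image computation and dead-end recovery, can be sketched on explicit sets of states. This is a simplified illustration under several assumptions that are not from the book: states are plain integers rather than BDDs, the abstract onion rings are given directly as sets of concrete states, BDD-SUBSETTING is replaced by keeping a single state, the priority queue of SCC-closed sets is replaced by a flat set `promising`, and the SCC enumeration step itself is omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple


def image(succ: Dict[int, Iterable[int]], states: Set[int]) -> Set[int]:
    """Conventional image: all successors of `states`."""
    return {t for s in states for t in succ.get(s, ())}


def sharp_image(succ: Dict[int, Iterable[int]], front: Set[int],
                abs_rings: List[Set[int]]) -> Set[int]:
    """Image of a small subset of `front`, taken from the outermost ring that intersects it."""
    for ring in reversed(abs_rings):
        hit = front & ring
        if hit:
            subset = {min(hit)}  # stand-in for BDD-SUBSETTING: keep one state
            return image(succ, subset)
    return set()


def guided_reach(succ: Dict[int, Iterable[int]], init: Set[int],
                 abs_rings: List[Set[int]], promising: Set[int]) -> Tuple[Set[int], Set[int]]:
    """Search forward until a promising state is hit or a fixpoint is reached."""
    reach = set(init)
    front = set(init)
    while front and not (front & promising):
        front = sharp_image(succ, front, abs_rings) - reach
        if not front:  # dead end: fall back to the exact image of Reach
            front = image(succ, reach) - reach
        reach |= front
    return front & promising, reach


# Hypothetical toy system; state 6 is a side branch that a targeted search can skip.
succ = {0: [1, 2], 1: [3], 2: [6], 3: [4], 4: [3], 6: []}
abs_rings = [{0}, {1, 2}, {3, 6}, {4}]
seeds, reach = guided_reach(succ, {0}, abs_rings, promising={3})
```

In this toy run the search reaches the promising state 3 without ever visiting state 6. When the promising set is empty or unreachable, the fallback to the exact image forces the loop to traverse every reachable state before it stops, which mirrors the empty-language case.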

When the language is indeed empty, all reachable states of the
subautomata must be traversed. Let $\eta_{\mathcal{A}}$ be the number of
reachable states of the exact system, and let the total number of
hyperlines be a constant value; then, the cost of deciding reachability
in our guided search procedure is $O(\eta_{\mathcal{A}})$. The total cost
of fair cycle detection depends on the underlying symbolic SCC
enumeration algorithm used in DETECT-FAIR-CYCLE, which we will analyze
in the next section.


6.4 Implementation and Experiments 


6.4.1 Complexity Analysis 


The refinement algorithm described thus far cannot improve the
complexity bound of the language emptiness check. On the other hand, it
does not make the theoretical complexity bound worse. In the following,
we show that the complexity of our incremental approach is within a
constant factor of that of the non-incremental one; this means that it
is $O(\eta_{\mathcal{A}})$ when the linear time algorithm of [GPP03] is
used in SCC-DECOMPOSE, or $O(\eta_{\mathcal{A}} \log \eta_{\mathcal{A}})$
when the lockstep algorithm [BGS00, BGS05] is used.

In the following theorem, we assume that the linear time algorithm of 
[GPP03] is used. 


THEOREM 6.4 If the set L of approximations can be partitioned into
subsets $L_1, \ldots, L_q$ such that, for some constant $\lambda$,


1 $|L_i| \le \lambda$;
2 for every $\mathcal{A}' \in L_i$, $\eta_{\mathcal{A}'} \le 2^i$; and
3 $\mathcal{A} \in L_q$,


then the generic SCC refinement algorithm runs in $O(\eta_{\mathcal{A}})$ steps.


PROOF: Both reachability computation and SCC enumeration take linear
time, so the total cost of SCC analysis for $\mathcal{A}'$ is bounded by
$k\eta_{\mathcal{A}'}$, for some constant k. Let the number of effective
states of $\mathcal{A}'$ be denoted by $\eta_{\mathcal{A}'}$; then

$$\eta_{\mathcal{A}'} \le \eta_{\mathcal{A}} .$$

Hence, the cost of analyzing all approximations and $\mathcal{A}$ itself
is bounded by

$$k\eta_{\mathcal{A}} \, (\lambda + \lambda/2 + \lambda/4 + \cdots + \lambda/2^q) ,$$

which is bounded by $2\lambda k\eta_{\mathcal{A}}$.
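
The last step of the proof is a geometric series. Writing $\eta_{\mathcal{A}}$ for the number of states of the exact system and $\lambda$ for the constant that bounds the size of each subset (notation assumed here for the worked step), the sum can be evaluated in closed form:

```latex
\[
k\,\eta_{\mathcal{A}} \sum_{j=0}^{q} \frac{\lambda}{2^{j}}
  \;=\; \lambda k\,\eta_{\mathcal{A}} \left( 2 - 2^{-q} \right)
  \;<\; 2 \lambda k\,\eta_{\mathcal{A}} .
\]
```

The bound therefore holds for every q, which is why the number of abstraction levels does not appear in the final complexity.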


While we cannot hope for an improved run time in the worst case, we 
expect that the refinement based approach will be beneficial when the 
state space breaks up into many small SCC-closed sets. 

In some special cases, we can prove the following linear complexity
result, even when the $n \log n$ algorithm of [BGS00, BGS05] is used for
SCC enumeration.


THEOREM 6.5 Under the assumptions for L of Theorem 6.4, if for some
constant $\gamma$, the pairs $(S, \mathcal{A}')$ passed to SCC-DECOMPOSE
satisfy $|S| \le \gamma \, \eta_{\mathcal{A}} / \eta_{\mathcal{A}'}$, then
the refinement algorithm runs in $O(\eta_{\mathcal{A}})$ time.


PROOF: The analysis of $\mathcal{A}$ consists of the decomposition of
SCC-closed sets of size bounded by $\gamma$. Their number is linear in
$\eta_{\mathcal{A}}$, and each decomposition takes constant time. Hence,
the total time for the analysis of $\mathcal{A}$ is
$O(\eta_{\mathcal{A}})$. If |C| is the number of states in SCC C of
$\mathcal{A}'$, then $|C| \, \eta_{\mathcal{A}} / \eta_{\mathcal{A}'}$ is
the effective size of C. The cost of analyzing $\mathcal{A}'$ is
therefore $O(\eta_{\mathcal{A}})$. With reasoning analogous to that of
Theorem 6.4, one finally shows that the total time is also
$O(\eta_{\mathcal{A}})$.


6.4.2 Experiments on SCC Refinement 


First, we describe the details of two implemented policies for the SCC
analysis algorithm D'n'C. Both versions implement the basic
popcorn-line approach, and therefore correspond to a breadth-first
search of the SCC refinement tree. The set of submodules is generated
and then ordered according to the static refinement scheduling strategy
of [Jan99], which partitions the entire set of state variables into
many smaller clusters. The partitioning is based on the structural
information of the model (e.g., latch connectivity). Each cluster is
considered as a submodule, and the parallel composition of all these
submodules is the concrete system. The submodules are heuristically
sorted according to their distances from the state variables appearing
in the property automaton.

The two implemented D'n'C policies differ in when to switch to the
endgame. The first policy de-emphasizes compositionality in comparison
to strength reduction by performing only two levels of composition. At
the first level, it computes the SCCs of the property automaton, and at
the second level, it composes all the other modules of the system. The
second policy tries to exploit the full compositionality implied by
Figures 5.5 and 5.6. To avoid too much overhead on analyzing the
over-approximations, it heuristically stops the refinement at some
point, and then immediately composes all the remaining modules, thus
proceeding directly to the exact system. In the implementation, we stop
the linear composition after 30% of the state variables have been
composed. Once the exact system is reached, the EMERSON-LEI algorithm
is applied to its SCC-closed sets. For ease of reference, we refer to
the first policy as the Two-Level method, and to the second as the
Multi-Level method.

In both policies, weak SCCs are grouped together and are checked for 
cycles in the concrete system immediately after they are discovered. The 
underlying assumption for the special handling of weak/terminal SCCs 
is that model checking these SCCs is cheaper in the concrete model. 
If D'n'C finds a concrete fair cycle, it terminates; otherwise, it
discards these SCCs. At any abstraction level, if no SCCs are present,
the algorithm also terminates because there is no cycle in the concrete
model either.

The proposed algorithm has been implemented in the symbolic model
checker VIS [B+96, VIS]. The results of Table 6.1 were obtained by
appropriately calling the standard Language Emptiness command of VIS.
SCC analysis was performed with the lockstep algorithm of [BGS00]. (A
separate study showed that the algorithm of [GPP03] performed slightly
worse than lockstep [BGS00, BGS05] in practice, because of its
additional bookkeeping overhead.) Prior reachability analysis results
were used as don't cares where possible.

In Table 6.1, all examples were run with the same fixed BDD order, 
which had been obtained with previous runs of dynamic variable re- 
ordering. The experiments were conducted on an IBM Intellistation run- 
ning Linux with a 400MHz Pentium II processor with 1GB of SDRAM. 
For the same set of models and property automata, a second table was 
also obtained with dynamic variable ordering turned on for each exam- 
ple. Similarly, a third table was obtained using the EL2 variant of the 
Emerson-Lei algorithm [HTKB92]. The second and third tables were 
omitted for brevity, since the results were not significantly different. 
(The only exception was the example nmodem1, which took only 209
seconds with EL2, versus 4384 seconds for the original Emerson-Lei
algorithm.)

Table 6.1 has four columns. The three fields of the first column give
the name of the example, a symbol indicating whether the formula passes
(P: no fair cycles exist) or fails (F: a fair cycle exists), and the
number of binary state variables in the system. The three fields of the
second column, obtained by directly applying the VIS Emerson-Lei
algorithm, give:


1 the time it took to run the experiment (T/O indicates a run time 
greater than 4 hours); 


2 the peak number of live BDD nodes (in millions); and 
3 the total number of pre-image (EX) / image (EY) computations needed. 


These same field descriptors also apply to the third and fourth columns 
for the Two-Level and Multi-Level versions of the D'n'C algorithm, ex- 
cept that the latter has an additional field that indicates how the verifi- 
cation process terminates: 'n' means that the algorithm arrives at some 
intermediate level of the refinement process in which there no longer 
exists any fair SCC; ’w’ means that there is a weak fair SCC found and 
it contains a fair cycle. 

The property automata being used in the experiment are translated 
from LTL formulae. In order to avoid bias in favor of the new approach, 
each model is checked against a strong LTL property automaton. Note 
that the presence of the ’n’ or ’w’ in the last field demonstrates that both 
pruning of the SCC refinement tree and strength reduction are active in 
these experiments. 

Comparing the D'n'C algorithm to the one by Emerson and Lei, we find
that, with only three exceptions out of 18 examples, there is a
significant (more than a factor of 2) performance advantage for the
D'n'C algorithm. Comparing the Two-Level and Multi-Level versions, one
sees that with four exceptions (eisenb2, philo2, philo3, and shamp2),
the two policies give comparable performance. This is because most of
the examples are simple mutual-exclusion and arbitration protocols, in
which the properties have little locality. We expect the compositional
algorithm to do even better on models with more locality. On the other
hand, we have found that the greater compositionality of the
Multi-Level version proves its worth, especially on the larger
examples.


6.4.3 Experiments on Disjunctive Decomposition


Now we describe the details of another implemented policy for
disjunctive decomposition and the targeted search for fair cycles. This
policy is a variant of the popcorn-line approach, with breadth-first
search of the SCC refinement tree. Before jumping to the exact system,
it trims the fair SCC-closed sets further on the remaining submodules
by the Cartesian product approach. The enhanced algorithm, called
D'n'C*, has been compared to D'n'C on the same set of test cases to
study the effectiveness of the added feature. The experiments were
conducted on a 400MHz Pentium II processor with 1GB of SDRAM.

In Table 6.2, prior reachability analysis results were used as don't
cares where possible. Note that the results in this table are obtained
in a relatively ideal case: it assumes that the exact reachability
computation results are available. The table has four columns. The
three fields of the first column give the name of the example, a symbol
indicating whether the formula passes or fails, and the number of
binary state variables in the system. The next three columns compare
the run time, the total memory usage, and the peak number of live BDD
nodes of the three methods. Comparing the D'n'C* algorithm to D'n'C, we
find three wins for D'n'C* and 15 wins for D'n'C. This indicates that
the disjunctive decomposition is encumbered by the overhead of
maintaining and decomposing the SCC graph. However, among the 15 wins
of D'n'C, only four are for problems requiring more than 100 seconds to
complete; the rest are for the easy problems. In contrast, on the three
wins of D'n'C*, D'n'C took 1337, 1683, and 233 seconds. Therefore, we
conclude that in general the additional overhead of disjunctive
decomposition is not significant. On the harder problems, D'n'C* is as
competitive as D'n'C when advance reachability analysis is feasible.

In Table 6.3, the same set of test cases has been checked with the 
approximate reachability analysis results as don't cares where possible— 
that is, with ARDCs as opposed to RDCs. We note that approximate 
reachability analysis is usually much faster than exact reachability analy- 
sis, and in practice, may be the only feasible way of extracting don't cares 
from large models. The table has four columns. The three fields of the 
first column repeat the description of the test cases. The next three 
columns compare the run time, the total memory usage, and the peak 
number of live BDD nodes of the three methods. Comparing the D'n'C* 
algorithm to D'n'C, we find 12 wins for D'n'C* and 6 for D’n’C. In ad- 
dition, all the 6 wins for D'n'C are for problems requiring less than 100 
seconds to complete; in contrast, D'n'C* wins more on the harder ones— 
on 8 out of its 12 wins, D'n'C timed out after 4 hours. The difference 
here is that both D’n’C and EL depend heavily on full reachability to 
restrict the search spaces, but the disjunctive decomposition and sharp 
guided search of D'n'C* minimize this dependency. 

We also conducted experiments on a set of much harder test cases, the 
Texas-97 benchmark circuits. The property automata being used in the 
experiments were also translated from LTL formulae. The experiments 
were run on an IBM Intellistation with a 1700MHz Pentium-IV processor 
and 2GB of SDRAM. The results are given in Table 6.4.

In Table 6.4, the comparison is with the results of approximate
reachability analysis as the don't cares where possible. Note that
exact reachability analysis is infeasible for most of these circuits,
except for MSI.
The table has four columns. The three fields of the first column give the 
name of the example, a symbol indicating whether the formula passes or 
fails, and the number of binary state variables in the system. The next 
three columns compare the run time, the total memory usage, and the 
peak number of live BDD nodes of the three methods. Comparing the 
D’n’C* algorithm to D’n’C, we find five wins for D'n'C* and two for 
D’n’C. Again, the two wins for D’n’C are easier problems, and the five 
wins for D’n’C* are much harder—among them, two cannot be finished 
by D’n’C within 8 hours. Therefore, this table demonstrates a decisive 
advantage of the D’n’C* algorithm over both D’n’C and EL. 


6.5 Further Discussion 


We have shown that over-approximations of the concrete system can
be used to gradually refine the SCC-closed sets to SCCs. The D'n'C
algorithm has the advantages of being compositional, considering only
parts of the complete state space, and taking into account the strength
of an SCC to deploy the proper model checking algorithm. We have
discussed the different policies in traversing the lattice of
over-approximated systems. In comparison to the original Emerson-Lei
algorithm, the new algorithm has demonstrated significant and almost
consistent performance improvement. This indicates the importance of
the three improvement factors built into the proposed algorithm: SCC
refinement, compositionality, and strength reduction.

We have also shown that the analysis of the SCC quotient graph of an
over-approximated system can be used to decompose the concrete search
state space. Based on disjunctive decomposition, our guided search
algorithm for fair cycle detection demonstrates further performance
improvement. Our experiments show that, for large systems or otherwise
difficult problems, heavy investment in these heuristics is well
justified.

The simplicity of the implemented policies in comparison to the
generality of our framework suggests that there can be many promising
extensions and variations. The joint application of over- and
under-approximations of the concrete system, for instance, is an
interesting direction for future work.

The generic framework can be highly parallelized by assigning
different entries from the Work list, as well as the disjunctive state
subspaces, to different processors. Processors that deal with disjoint
sets of states have minimal communication and synchronization
requirements. Although the algorithm is geared towards BDD based
symbolic model checking, SCC refinement can also be combined with
explicit state enumeration and SAT based approaches.
Chapter 7


FAR SIDE IMAGE COMPUTATION 


In the next two chapters, we will apply the idea of abstraction
followed by successive refinements to two basic decision procedures in
formal verification: BDD based image computation and Boolean
satisfiability checking. Image computation accounts for most of the CPU
time in symbolic model checking, while a Boolean SAT solver is the
basic work engine in bounded model checking.

In image computation, the peak BDD size is the controlling factor for
the overall performance of the algorithm. Don't care conditions have
been routinely applied to the present-state variables to minimize the
transition relation. However, applying don't cares to the far side,
that is, to the next-state variables, is often ineffective. In this
chapter, we present a new algorithm that computes a set of
over-approximated images and applies them as the care sets to the far
side of the transition relations. The minimized transition relation is
then used to compute the exact image.


7.1 Symbolic Image Computation


Image computation is the most fundamental operation in BDD based 
symbolic fixpoint computation. It has been extensively used in sequen- 
tial system optimization and formal verification. Given a state transition 
system, image computation is used to find all the successors of a given 
set of states according to a set of transitions. Existing algorithms for 
computing images fall into two categories: one is based on the
transition function [CBM89a], and the other is based on the transition
relation [GB94, RAB+95, MHS00, CCJ+01b, JKS02]. In this chapter, we
focus on the transition relation based methods.

Except for small systems, the transition relation (TR) cannot be
represented by a monolithic BDD. Instead, it is usually represented by
a collection of BDDs (called clusters) whose conjunction is the entire
transition relation. This representation is called the partitioned
transition relation. When the partitioned transition relation is used,
the image is computed by conjoining the given set with all the
transition relation clusters and then existentially quantifying the
present-state variables and inputs.

Given a partitioned transition relation $T = \{T^i\}$ and a set of
states D, the image is computed as follows:

$$\mathrm{IMG}(T, D) = \exists x, w .\ \bigwedge_{1 \le i \le k} T^i(x, w, y) \wedge D(x) . \tag{7.1}$$


The performance of this computation depends heavily on the size of the 
BDDs that represent the set of states, the transition relation, and the 
intermediate products during the evaluation of this quantified Boolean 
formula. 

In the conjoin-quantify operation (Equ. 7.1), the way in which
transition bit-relations are grouped into clusters and the order in
which variables are quantified are both important. The problem of
clustering and ordering to minimize the peak size of the intermediate
products is called the quantification scheduling problem. A technique
called early quantification is often used to exploit the fact that each
$T^i$ usually depends on a subset of the present-state variables and
inputs, and some of these variables can be quantified out before all
the clusters are conjoined.


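Let $Q_1, \ldots, Q_k$ be a partition of the set $x \cup w$ of
present-state variables and inputs; then conjoin and quantify can be
interleaved as follows:

$$\mathrm{IMG}(T, D) = \exists Q_k . \bigl( T^k(x, w, y^k) \wedge \cdots \wedge \exists Q_2 . \bigl( T^2(x, w, y^2) \wedge \exists Q_1 . \bigl( T^1(x, w, y^1) \wedge D(x) \bigr) \bigr) \cdots \bigr) .$$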


Temporary results produced in the middle of the computation, such
as $\exists Q_1 . (T^1(x, w, y^1) \wedge D(x))$, are called intermediate
products. The peak sizes of the intermediate products are often
essential in determining whether a given symbolic image computation can
be completed on a given computer.
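
A brute-force check can make the interleaving of conjunction and quantification concrete. The sketch below is only an illustration under stated assumptions: Boolean functions are Python predicates over a dictionary of variable values instead of BDDs, the two-latch system (`T1`, `T2`, `D`) is hypothetical, and the schedule relies on the usual side condition that a variable is quantified early only if no cluster conjoined later depends on it.

```python
from itertools import product
from typing import Callable, Dict, List, Set, Tuple

Env = Dict[str, bool]
Pred = Callable[[Env], bool]


def exists(names: List[str], f: Pred) -> Pred:
    """Existentially quantify the Boolean variables `names` out of f."""
    def g(env: Env) -> bool:
        return any(
            f({**env, **dict(zip(names, vals))})
            for vals in product([False, True], repeat=len(names))
        )
    return g


def conj(f: Pred, g: Pred) -> Pred:
    return lambda env: f(env) and g(env)


def image_set(f: Pred) -> Set[Tuple[bool, bool]]:
    """Collect the next-state valuations (y1, y2) that satisfy f."""
    return {
        (y1, y2)
        for y1, y2 in product([False, True], repeat=2)
        if f({"y1": y1, "y2": y2})
    }


# Hypothetical clusters: y1 = x1 OR w, y2 = x1 AND x2; from set D = x1 AND NOT x2.
T1: Pred = lambda e: e["y1"] == (e["x1"] or e["w"])
T2: Pred = lambda e: e["y2"] == (e["x1"] and e["x2"])
D: Pred = lambda e: e["x1"] and not e["x2"]

# Monolithic form: conjoin everything, then quantify x1, x2, and w.
monolithic = exists(["x1", "x2", "w"], conj(conj(T1, T2), D))

# Interleaved form: w occurs only in T1, so Q1 = {w} is quantified first.
scheduled = exists(["x1", "x2"], conj(T2, exists(["w"], conj(T1, D))))
```

Both forms describe the same image; the interleaved one never builds a product that mentions w together with T2, which is the effect early quantification exploits to keep intermediate products small.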

Studies on the effect of early quantification can be traced back to
the early work of [TSL+90, Bur91, GB94]. The quantification scheduling
problem was proved to be NP-complete in [HKB96]. A practically
successful heuristic algorithm, known as IWLS95, was proposed in
[RAB+95]. The algorithm goes as follows: first, a heuristic score is
used to order the transition bit-relations; second, these bit-relations
are linearly clustered together until the BDD size exceeds a certain
threshold; finally, the clusters are ordered according to the same
heuristic score. The IWLS95 algorithm is representative of a class of
linear quantification schedules. Recent progress along this line of
research includes the algorithm based on the Minimum Lifetime
Permutation (MLP) [MHS00, CCJ+01b] and the Fine-Grain image algorithm
[JKS02]. Alternatively, image computation can be regarded as a problem
of constructing an optimal parse tree for the image set. This results
in more general quantification schedules [HKB96, GYAG00, CCJ+01a].

The new method we introduce in this chapter is not another heuris- 
tic for the quantification scheduling problem. Instead, it provides a 
higher-level framework that can be implemented on top of any of these 
heuristics. 

Exact or approximate reachable states have been commonly used as
don't cares in symbolic model checking to help the pre-image
computation. Transitions from unreachable states can be added or
removed in order to reduce the BDD size of the transition relation
without changing the results of fixpoint computations restricted to
reachable states. The constrain and restrict operators [CBM89b, CM90]
are often used in this context to accomplish the BDD minimization. Both
operators are specific instances of a more general operation called the
generalized cofactor [SHSVB94, HBLS98]. A generalized cofactor of a
function T with respect to a set R, denoted by $T' = T \Downarrow R$,
can be any characteristic function in the interval

$$(T \wedge R) \le T' \le (T \vee \neg R) .$$

An important property of the generalized cofactor is

$$T' \wedge R = T \wedge R .$$
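
The interval and the property above can be checked by brute force on small functions. In this sketch, which is an illustration and not how a BDD package implements constrain or restrict, a Boolean function is represented by its on-set (the set of satisfying assignments), so the ordering between functions becomes set inclusion. The example functions are hypothetical.

```python
from itertools import product
from typing import Callable, Set, Tuple

Minterm = Tuple[int, ...]


def on_set(f: Callable[..., object], n: int) -> Set[Minterm]:
    """All assignments of n Boolean variables on which f is true."""
    return {bits for bits in product((0, 1), repeat=n) if bool(f(*bits))}


def is_generalized_cofactor(t_prime: Set[Minterm], t: Set[Minterm],
                            r: Set[Minterm], universe: Set[Minterm]) -> bool:
    """Check (T AND R) <= T' <= (T OR NOT R) as set inclusions."""
    return (t & r) <= t_prime <= (t | (universe - r))


U = on_set(lambda a, b, c: True, 3)
T = on_set(lambda a, b, c: (a and b) or ((not a) and c), 3)  # if a then b else c
R = on_set(lambda a, b, c: a, 3)                             # care set: a is true
T_min = on_set(lambda a, b, c: b, 3)                         # simpler function, agrees with T on R
T_bad = on_set(lambda a, b, c: c, 3)                         # disagrees with T inside R
```

Here `T_min` depends on one variable instead of three, yet it coincides with `T` everywhere inside the care set, which is exactly the freedom that BDD minimization exploits.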


Generalized cofactors heuristically make the choice so that the BDD
of $T'$ is minimized in some sense. Therefore, the operation indicated
by $(T \Downarrow R)$ is called BDD minimization. In practice, BDD
minimization must be applied very carefully for it to be effective.
When R has a large BDD or when R contains many variables that do not
appear in T, BDD minimization using either restrict or constrain is
ineffective. This is precisely the case when one tries to minimize a
sub-relation $T^i(x, w, y)$ with respect to the set R of (approximate)
reachable states in next-state variables, because $T^i$ often contains
few next-state variables, but the set R contains most of the next-state
variables. Previously, simplification of the transition relation by
applying reachability don't cares to the far side was not in common
use.


7.2 The Far Side Image Algorithm


In this section, we present a new image computation algorithm which 
applies approximate reachability don't cares to the next-state variables 
of the transition relation instead of the customary present-state vari- 
ables. Two problems may arise when one tries to modify the far side 
of the transition relation: First, if a transition is added from a reach- 
able state into a non-reachable state, the result of image computation is 
changed. Second, minimizing the BDD representation of the transition 
relation on the far side with the entire approximate reachable states is 
not effective due to the reason given above. Both problems are solved 
by the proposed algorithm. For the first problem, we show how the 
error states introduced by spurious transitions can be eliminated from 
the result of each image computation. To solve the second problem, we 
use local approximations of the reachable states, which are practically 
effective at simplifying the transition relation representations. 

The new algorithm, called FARSIDEIMG [WHS03], is presented in Figure
7.1. The algorithm takes as arguments the partitioned transition
relation $\{T^i\}$ and the set D of states, and returns the exact image
set of the given states. In the pseudo code, the procedure IMAGE
represents a generic image computation procedure, which computes the
image using Equ. 7.1. Given the appropriate quantification scheduling,
the procedure IMAGE can represent any of the transition relation based
image computation methods described in [RAB+95, MHS00, CCJ+01a,
JKS02]. In this sense, our FARSIDEIMG algorithm can be built on top of
any transition relation based quantification scheduling.

Since the transition relation $T$ is a conjunction of the individual
transition relation clusters, each cluster $T^i$ is considered an
over-approximation of $T$. Based on this observation, we first compute a
series of upper-bound images, one for each transition relation cluster,
as follows,


$$R_i^+(y^i) = \exists x, w \,.\, T^i(x, w, y^i) \wedge D(x) \;.$$


Since $D$ does not contain any input variable, $w$ can be existentially
quantified out of $T^i$ before it is conjoined with $D$. A similar argument is
applied to the present-state variables that appear only in $D$ (represented
by $Q_A$) and those that appear only in $T^i$ (represented by $Q_B$). The often
small BDD size of $T^i$ and the early quantifications of $w$, $Q_A$, and $Q_B$
make $R_i^+$ much easier to compute than the exact image $R$.

Next, the BDD of each $T^i$ is minimized with respect to the corresponding
approximated image. It is important to notice that we could have
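FARSIDEIMG($\{T^i\}$, $D$)
{
 1   foreach $i \in \{1, \ldots, k\}$ do
 2       $x_T \leftarrow$ present-state variables in the support of $T^i$
 3       $x_D \leftarrow$ present-state variables in the support of $D$
 4       $Q_A \leftarrow \{x_D\} \setminus \{x_T\}$
 5       $Q_B \leftarrow \{x_T\} \setminus \{x_D\}$
 6       $Q_C \leftarrow \{x_T\} \cap \{x_D\}$
 7       $R_i^+ \leftarrow \exists Q_C \,.\, (\exists w, Q_B \,.\, T^i) \wedge (\exists Q_A \,.\, D)$
 8       $\tilde{T}^i \leftarrow T^i \downarrow R_i^+$
 9   od
10   $\tilde{R} \leftarrow$ IMAGE($\{\tilde{T}^i\}$, $D$)
11   $R \leftarrow \tilde{R} \wedge \bigwedge_i R_i^+$     // clipping
12   return $R$
}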


Figure 7.1. The Far Side image computation algorithm. 


used the set $\bigwedge_j R_j^+$ instead of $R_i^+$ to minimize $T^i$. In theory, a smaller
care set (or a larger don't care set as in this case) provides more degrees of
freedom for minimization. However, the subsets of next-state variables
in different $R_i^+(y^i)$ are disjoint. No next-state variable of $R_j^+(y^j)$, where
$j \neq i$, appears in $T^i(x, w, y^i)$. This makes the minimization of $T^i$ with
respect to the set $\bigwedge_j R_j^+$ ineffective. On the other hand, the local
approximation $R_i^+(y^i)$ contains only the next-state variables of $T^i(x, w, y^i)$, and
both of them typically depend only on a small subset $y^i$ of the next-state
variables. Heuristic algorithms like constrain and restrict perform
much better in practice when minimization is with respect to $R_i^+(y^i)$.
Among the two operators, restrict is more robust because it prevents
unwanted BDD variables from appearing in the result. Therefore, we
will use restrict in our implementation and experimental investigation.

The minimized transition relation cluster $\tilde{T}^i$ is a characteristic function
within the interval


$$(T^i \wedge R_i^+) \;\le\; \tilde{T}^i \;\le\; (T^i \vee \neg R_i^+) \;.$$
Minimization can be regarded as adding or removing transitions pointing
to $\neg R_i^+$, as is illustrated in Figure 7.2. It may add, for instance,
transitions that are pointing to $\neg R_i^+$, as represented by the dotted lines.
Likewise, it may remove from $T^i$ transitions that are pointing to $\neg R_i^+$,
as described by the solid line marked by a cross.








Figure 7.2. Minimizing the transition relation. 


Finally, we compute another over-approximation of the overall image,
$\tilde{R}$, by applying the generic image computation to the minimized
transition relation. $\tilde{R}$ contains all the states of the exact image (represented
by $R$) and possibly some states in $\neg R_i^+$ due to the added transitions.
We use the clipping operation of Line 11 to get rid of those error states
by conjoining $\tilde{R}$ with all the other over-approximations.
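
The steps of Figure 7.1 can be traced on a toy model with explicit sets in place of BDDs. The following is a minimal sketch under stated assumptions, not the VIS implementation: the names `successors`, `image`, and `far_side_image` are hypothetical, each cluster is stored as a set of triples (x, w, y_i), and the restrict-based minimization of Line 8 is replaced by simply taking the upper end of the allowed interval (all transitions into the complement of the local upper-bound image are added). The point it illustrates is that clipping removes every error state introduced by such added transitions.

```python
from itertools import product

def successors(cluster, x, w):
    """Next-state values y_i that one cluster allows from state x under input w."""
    return {yi for (cx, cw, yi) in cluster if cx == x and cw == w}

def image(clusters, D, inputs):
    """Exact image of D under the conjunction of all clusters (explicit-state)."""
    result = set()
    for x in D:
        for w in inputs:
            choices = [successors(c, x, w) for c in clusters]
            result.update(product(*choices))  # one y_i per cluster forms a next state
    return result

def far_side_image(clusters, D, states, inputs, domain=(0, 1)):
    """Far-side image: minimize each cluster on the far side, then clip."""
    uppers, minimized = [], []
    for c in clusters:
        # Lines 2-7: upper-bound image of this cluster taken alone.
        r_plus = {yi for (x, w, yi) in c if x in D}
        # Line 8 stand-in (assumption): use the upper end of the interval,
        # i.e. add every transition that points outside r_plus.
        spurious = {(x, w, yi) for x in states for w in inputs
                    for yi in domain if yi not in r_plus}
        uppers.append(r_plus)
        minimized.append(c | spurious)
    r_tilde = image(minimized, D, inputs)            # Line 10
    return {y for y in r_tilde                       # Line 11: clipping
            if all(yi in up for yi, up in zip(y, uppers))}

# Toy model: two state bits, one input bit; y0 = x0 XOR w, y1 = x0 AND x1.
states = list(product((0, 1), repeat=2))
inputs = [0, 1]
T0 = {(x, w, x[0] ^ w) for x in states for w in inputs}
T1 = {(x, w, x[0] & x[1]) for x in states for w in inputs}

D = {(0, 0)}
print(sorted(image([T0, T1], D, inputs)))
print(sorted(far_side_image([T0, T1], D, states, inputs)))
```

Both calls print the same set of next states. With BDDs, the benefit comes from the minimized clusters being smaller; the explicit-set version only demonstrates that correctness is preserved.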

The following theorem establishes the correctness of the new algo- 
rithm. 


THEOREM 7.1 FARSIDEIMG computes the same image set as IMAGE
does. That is, given a partitioned transition relation $\{T^i\}$ and a set $D$
of states,

$$\text{FARSIDEIMG}(\{T^i\}, D) = \text{IMAGE}(\{T^i\}, D) \;.$$


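PROOF: Let the result of IMAGE($\{T^i\}$, $D$), as described by Equ. 7.1,
be denoted by $R(y)$. Since the images computed in Lines 2-7 are
over-approximations, we have

$$R(y) \wedge \bigwedge_i R_i^+(y^i) = R(y) \;.$$

Because of the definition of generalized cofactors,

$$\tilde{T}^i \wedge R_i^+ = T^i \wedge R_i^+ \;.$$

By Lines 10-12 in Figure 7.1,

$$\begin{aligned}
& \text{FARSIDEIMG}(\{T^i(x, w, y^i)\}, D(x)) \\
&= \tilde{R}(y) \wedge \bigwedge_i R_i^+(y^i) \\
&= \text{IMG}\Big(\bigwedge_i \tilde{T}^i(x, w, y^i), D(x)\Big) \wedge \bigwedge_i R_i^+(y^i) \\
&= \Big(\exists x, w \,.\, \bigwedge_i \tilde{T}^i(x, w, y^i) \wedge D(x)\Big) \wedge \bigwedge_i R_i^+(y^i) \\
&\qquad \text{because } R_i^+(y^i) \text{ does not depend on } x \text{ and } w \\
&= \exists x, w \,.\, \bigwedge_i \big(\tilde{T}^i(x, w, y^i) \wedge R_i^+(y^i)\big) \wedge D(x) \\
&\qquad \text{by the property of the generalized cofactor} \\
&= \exists x, w \,.\, \bigwedge_i \big(T^i(x, w, y^i) \wedge R_i^+(y^i)\big) \wedge D(x) \\
&= \Big(\exists x, w \,.\, \bigwedge_i T^i(x, w, y^i) \wedge D(x)\Big) \wedge \bigwedge_i R_i^+(y^i) \\
&= \text{IMG}\Big(\bigwedge_i T^i(x, w, y^i), D(x)\Big) \wedge \bigwedge_i R_i^+(y^i) \\
&= R(y) \wedge \bigwedge_i R_i^+(y^i) \\
&= R(y) \\
&= \text{IMG}\Big(\bigwedge_i T^i(x, w, y^i), D(x)\Big)
\end{aligned}$$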


7.3 Experiments 


We have implemented the FARSIDEIMG procedure in the symbolic
model checker VIS 2.0 [B+96, VIS], on top of both the MLP image
computation algorithm [MHS00] and the Fine-Grain image computation
algorithm [JKS02]. The new algorithm was compared with the
standard MLP algorithm and Fine-Grain image computation algorithm
in the reachability analysis of 35 circuits from the public domain as well as
industry. The "S" circuits come from the ISCAS'89 benchmark [ISC],
the "D" circuits come from industry, and the others come from the VIS
verification benchmark [VVB]. All the experiments were conducted on
an IBM IntelliStation with a 1.7 GHz Pentium IV CPU and 2GB of
RAM. The data size limit for each process was set to 750MB.

Table 7.1 shows the comparison of the run time and memory usage 
of the Far Side algorithm and MLP, with dynamic variable reordering 
method “sift”. The image cluster threshold has been set to the default
value, 5000. Columns 1-3 are the name, the number of binary state
variables, and the number of inputs of each circuit. Columns 4-6 compare
the CPU time; Columns 7-9 compare the peak number of live BDD nodes
during the image computations. Note that neither of the two methods can
complete the last 5 circuits, for which the run time and peak live BDD
nodes are up to the last step reached by both methods (indicated by the
number in parentheses in Column 1). The data of D14, for instance, are
up to 12 steps, as indicated by (12) in Column 1. Within the 8-hour
time limit, FARSIDEIMG was able to finish one more step than MLP
(indicated by [13] in Column 5).

The total run time of the 35 examples was 171,876 seconds for the
original MLP, and 114,710 seconds for FARSIDEIMG. Overall, this is a
33% improvement for FARSIDEIMG. However, note that MLP ran out
of time on am2901 and palu, which means that the 33% win is a lower
bound.
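
The percentage figures quoted here and in the "%" columns of the tables are consistent with a relative saving measured against MLP. The snippet below is a small sketch of that arithmetic; the formula and the function name `improvement` are assumptions inferred from the reported numbers, not a definition given in the text.

```python
def improvement(mlp, farside):
    """Relative saving of FARSIDEIMG over MLP in percent (positive means FARSIDEIMG wins)."""
    return 100.0 * (mlp - farside) / mlp

# Total CPU time over the 35 circuits, and the anomalous circuit prolog (Table 7.1).
total = improvement(171_876, 114_710)
prolog = improvement(2443, 7907)
print(f"total: {total:.1f}%   prolog: {prolog:.1f}%")
```

Because the MLP times for am2901 and palu are capped at the time limit, the true total saving can only be larger than the value computed from the capped numbers.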

Another way to analyze the data is to partition the examples into 
groups and compare the performance on different groups. Group “easy” 
consists of circuits whose reachability analysis can be finished within 15 
minutes (the first 15 circuits); Group “hard” consists of circuits whose 
reachability can be finished by at least one method within 8 hours; Group 
"harder" consists of the rest of the circuits. The average run time and 
the geometric mean of the peak live BDD nodes of the two algorithms 
are compared separately for the three groups as follows: 




















               Group "easy"            Group "hard"           Group "harder"
            MLP  FARSIDE    %       MLP  FARSIDE    %       MLP  FARSIDE    %
CPU (s)     281      280    0      6871     4311  +37     14233     9972  +30
BDD (k)     157      161   -2      1181      845  +25      3348     2729  +18








These data show that the run time and the peak BDD size for Groups 
“hard” and “harder” average an order of magnitude larger than those 
for Group “easy”. On the “easy” problems, FARSIDEIMG does not win 
because of its additional overhead in approximation and refinement. On 
the “hard” and “harder” problems, controlling BDD size by applying 
Don't Cares to the far side inside image computation pays off. 

Note that in one anomalous “hard” circuit, prolog, MLP outperforms 
FARSIDEIMG by more than a factor of three. We believe that the anom- 
aly is due to the noise introduced by the BDD dynamic variable reorder- 
ing during reachability analysis. To verify this conjecture, the BDD 
orders at the end of reachability analysis were stored, and with these 
fixed variable orders we re-ran all the experiments. (For those “harder”
circuits whose reachability analysis cannot be finished, the default fixed
Table 7.1. Comparing FARSIDEIMG and MLP with dynamic variable reordering.

                                 CPU (s)            Peak BDD nodes (k)
Design     regs  inputs     MLP     FARSIDE      %     MLP  FARSIDE     %
D12          48      16       6           7    -16     204      197    +3
abs.fabr     87      21      26          20    +25      43       46    -5
D23          85      22      11          11      0      24       24     0
nosel       128      65      30          32     -5      57       53    +7
bpb          36       9      33          53    -60      94      108   -14
shampoo     140      21      55          79    -44      91       98    -7
soap        140      11      73          85    -17     101       97    +3
3_proc       62      18     103         106     -2     213      183   +13
soapLtl3    142      11     360         329     +8     341      296   +13
s1512        57      29     364         435    -19      91        8    +2
Feistel     293      68     541         567     -4     159      229   -43
Ds mo a 250 301 30
DI woo 78 665 60 +8
s4863o       88      35     582         510    +12     402      402     0
cps1364o    134      97     598         672    -12     421      447    -6
D21          92       6     626         610     +2     466      474    -1
D2           94       6    1024         862    +15     765      745    +2
cps1364     231      97    1198         971    +18     363      372    -2
s4863       104      49    1274        1037    +18     749      602   +19
Di w 22 SO 4 +e
icctl        62      27    1462        1934    -32    1503     1515     0
s5378opt    121      35    1476         552    +62     508      335   +34
FIFOs       142       7    2129        1907    +10    1098     1021    +6
prolog      136      36    2443        7907   -223    1935     2287   -18
s3271       116      26    2627        1788    +31    1085      820   +24
s1269        37      18    3513        2958    +15    3588     3520    +1
D22         140      20    9351       12821    -37    3356     2834   +15
s3330       132      40   10733        2382    +77    3110     2922   +49
am2901       68      27  >28800        4827   >+83       -     2849     -
palu         37      10  >28800       19153   >+33       -     8985     -
D14 (12)     96      21   19978   6099 [13]    +69    4833     3066   +63
D15 (31)    106      31   12021   9795 [35]    +19    5855     4435   +24
D16 (16)    531      16   11264   6845 [17]    +40    3142     3142     0
D18 (23)    507     200   11994       11458     +4    1621     1385   +14
D20 (7)     562      31   15910       15666     +1    2926     2561   +12
variable orders generated by VIS’s static_order command were used.)
The results are shown in Table 7.2.

With the fixed orderings, some circuits run much faster (such as pro- 
log), some run much slower (s5378opt is about 4 times slower), and some 
run out of time (such as s3271). It is important to notice that for the 
anomalous circuit prolog, FARSIDEIMG has a marginal win (94 seconds 
vs. 104; 7200k BDD nodes vs. 9967k). This confirms that the anomaly 
in Table 7.1 was due to noise introduced by dynamic variable reordering. 

With the fixed orders, the average run time (among those completed)
was 576 seconds for MLP and 607 seconds for FARSIDEIMG. More importantly,
there are now 3 circuits in Group “hard” that cannot be completed
by either method because they run out of memory (M/O). The
peak numbers of live BDD nodes are often an order of magnitude higher
than those in Table 7.1. Given the fact that finding a good fixed order is
hard in practice, it is generally accepted that dynamic variable reordering
is required when dealing with industrial-strength circuits. Therefore,
we claim that the data with dynamic reordering are more significant.

Our experiments with the Fine-Grain image algorithm of [JKS02]
demonstrated a similar performance improvement for FARSIDEIMG.


7.4 Discussion of Hypothesis 


Our hypothesis is that the performance gain of FARSIDEIMG is due
to the reduction of peak BDD size inside the conjoin-quantify operation.
However, FARSIDEIMG focuses on minimizing the BDDs used inside the
image computation, not the overall BDDs used in the reachability analysis.
Therefore, the overall BDD data in the previous two result tables are 
less informative, since a large part of the BDDs are used for representing 
the accumulated reachable states. The set of accumulated reachable 
states can become quite large near the end of the reachability analysis. 
Therefore, we also performed experiments that attempted to measure 
data more relevant to the hypothesis. 

In Figure 7.3, FARSIDEIMG (solid line) and MLP (dotted line) are 
compared on two different parameters: the BDD size of the intermedi- 
ate products and the peak number of live BDD nodes. Note that the 
latter includes the BDDs representing the accumulated reachable states. 
The horizontal axis shows the image steps, from 1 to 43, indicating the 
sequential depth of 43. The upper figure shows that except for a few 
iteration steps (e.g., Steps 2, 3, 6, and 8), the minimization is effective 
at reducing the maximum BDD size of the intermediate products. 

Since run times are determined primarily by the maximum BDD size 
of the intermediate products, the upper part in Figure 7.3 is more in- 
structive in explaining the reason for the speed-up achieved by FAR- 
Table 7.2. Comparing FARSIDEIMG and MLP with fixed variable ordering.

                                 CPU (s)            Peak BDD nodes (k)
Design     regs  inputs     MLP     FARSIDE      %     MLP  FARSIDE     %
D12          48      16       1
abs.fabr     87      21      36
D23          85      22
shampoo     140
soap        140      11
3_proc       62      18
soapLtl3    142      11      55          99    -81    4760     4836    -1
s1512        57      29     837        1120    -33   24085    23469    +2
Feistel     293      68       4           5    -29     744       64    +8
D5          319      24      94         114    -21    4665     4158   +10
                                                      8469     8469     0
cps1364o 134 | 2080 1993 +4
D21          92       6     224         305    -35    4649     3670   +21
D2           94       6     248         326    -31    3268     3149    +3
cps1364     231      97      25          24     +2    2021     1572   +22
s4863       104      49      59          35    +39     822      867    -5
D4          230      22      79         117    -47    1035     1096    -5
icctl        62      27     115         134    -15    3084     2268   +26
s5378opt    121            6960        6593          24160    20221   +16
FIFOs       142       7     444         424      +    4252     4710   -10
prolog      136      36     104          94     +9    9967     7200   +27
s3271       116      36     M/O         M/O      -
s1269        37      18    2668        2603     +2   44928    44928     0
D22         140      20    2414        3158    -30    1736     1716    +1
s3330       132      40     931         913     +1   24735    24736     0
am2901       68      27     M/O         M/O      -     M/O      M/O     -
palu         37      10     M/O         M/O      -     M/O      M/O     -

Figure 7.3. s5378opt: The upper part is the BDD size of the intermediate products
at different steps during the reachability analysis; the lower part is the total number
of live BDD nodes, including BDDs representing the accumulated reachable states.

SIDEIMG. In Figure 7.3, the peak occurs at iteration 21, where the
MLP size is about 3 times larger than the FARSIDEIMG size. The FAR- 
SIDEIMG size peaked near iteration 29, at which the MLP size was about 
the same. The curve with fixed variable orders is similar. 

Figure 7.4 shows the effect of the transition relation minimization
by FARSIDEIMG, i.e., the BDD size reduction in percentage at different
steps of the reachability analysis. From top down, these data are for circuits
s5378opt, prolog, and s3271. (Data for the other 32 circuits are
similar.) Each graph has two curves: one for dynamic variable reordering,
and the other for fixed ordering. Note that 50% on the curve means
that the BDD size of the minimized transition relation is half of the
BDD size of the original one. For the first few reachability steps, the
reduction in TR size is substantial. As the iteration count grows, the
size reductions saturate at a marginal value (0 to 40%).

In the saturation phase (the right side of the curves), the reductions
are greater when a fixed ordering is used. The data for s5378opt in
Tables 7.1 and 7.2 show that even though the reductions never fell to
less than 30%, reachability analysis is 5 times slower for MLP with fixed
variable ordering than with dynamic variable reordering. This might
appear to be anomalous since we have attributed time reductions to
BDD size minimization. However, note that these data are percentages,
not absolute values. The size of the minimized transition relation for
fixed variable ordering is still much larger than the size of the minimized
transition relation for dynamic variable ordering, as indicated by the
minimized absolute values in the lower part of Figure 7.3.

The plateaus for s5378opt correspond to calls by the BDD manager
to the reordering routine (these occurred at iterations 10, 17 and 27). 
In between these calls, the reductions follow a saturating pattern similar 
to the curves for fixed BDD ordering. Sometimes there is a final phase 
of increased reduction (the right side of the curves), due to the fact that 
image size decreases near the end of the reachability analysis. (A smaller 
image makes a better constraint for minimization.) 

For prolog, reachability analysis is more than an order of magnitude 
faster with fixed ordering. (The size of the transition relation for fixed 
ordering is about 2 times larger than for dynamic variable reordering.) 
This is a case where the BDD reordering itself takes a larger proportion
of the time. 

The bottom part of Figure 7.4 pertains to circuit s3271. With the
fixed variable ordering, this circuit can complete only 4 iterations be- 
fore running out of the 750MB memory, whereas with dynamic variable 
reordering, it can complete all 17 iterations in only about 30 minutes. 
Although enabling the dynamic variable reordering makes it difficult to 
Figure 7.4. The BDD size reduction of the transition relation, in terms of the ratio
of the BDD size of minimized transition relation to the original BDD size.

isolate the effects of algorithmic improvement, it appears to be the only
viable option for some hard models. 

To summarize, the performance improvement of the FARSIDEIMG al- 
gorithm based on compositional BDD minimization is significant on av- 
erage, and especially significant on difficult circuits. For circuits requir- 
ing more than 15 minutes to complete the reachability analysis, FAR- 
SIDEIMG, implemented on top of MLP, is significantly faster than the 
standard MLP in 16 out of the 19 cases. It is reasonable, therefore, to 
conclude that the new method is more robust in large industrial-strength 
applications. 


Chapter 8 


REFINING SAT DECISION ORDERING 


In bounded model checking, the series of SAT problems for check- 
ing the existence of finite-length counterexamples are highly correlated. 
This strong correlation can be used to improve the performance of the 
SAT solver. The performance of modern SAT solvers using the DLL 
recursive search procedure depends heavily on the variable decision or- 
dering. In this chapter, we propose a new algorithm to predict a good 
variable decision ordering based on the analysis of unsatisfiability proofs 
of previous SAT instances. We then apply the new decision ordering to 
solving the current SAT instance. By combining the predicted ordering 
with the default decision heuristic of the SAT solver, we can achieve 
a significant performance improvement in SAT based bounded model 
checking. 


8.1 Unsatisfiability Proof as Abstraction 


In bounded model checking, the existence of a finite-length counterex- 
ample is formulated into a Boolean SAT problem. The satisfiability 
of a Boolean formula can be decided by the Davis-Logemann-Loveland
(DLL [DLL62]) recursive search procedure, which has been adopted by 
many modern SAT solvers. The basic steps in a DLL procedure are 
making decisions (assigning values to free variables) and propagating 
the implications of these decisions to the subformulae. 

Like many other search problems, the order in which these Boolean 
variables are assigned, as well as the values assigned to them, affects 
a SAT solver's performance significantly. Conceptually, different vari- 
able decision orderings imply different binary search trees, whose sizes 
and corresponding search overheads can be quite different. Because of 
the NP-completeness of the SAT problem, finding the optimal decision 




ordering is unlikely to be easier; modern SAT solvers often use heuristic
algorithms to compute decision orderings that are “good enough”
for common cases. For instance, the SAT solver Chaff [MMZ+01] uses
a decision heuristic called Variable State Independent Decaying Sum
(VSIDS), which is effective in solving many large industry benchmarks.
Some of the pre-Chaff decision heuristics can be found in the survey
paper by Silva [Sil99].
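
To make the role of the decision ordering concrete, here is a minimal sketch of a DLL-style search that takes the variable order as a parameter and counts decisions. It is an illustration under stated assumptions, not the Chaff implementation: clauses are lists of signed integers, there is no conflict learning or VSIDS, the order must list every variable of the formula, and the names `unit_propagate` and `dll` are hypothetical.

```python
def unit_propagate(clauses, assignment):
    """Assign literals forced by unit clauses; return None on a conflict."""
    assignment = dict(assignment)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned, satisfied = [], False
            for lit in clause:
                var, val = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == val:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None                      # every literal is false
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0   # implication
                changed = True
    return assignment

def dll(clauses, order, assignment=None, stats=None):
    """Return a satisfying assignment, or None if the formula is unsatisfiable."""
    assignment = unit_propagate(clauses, assignment or {})
    if assignment is None:
        return None
    free = [v for v in order if v not in assignment]
    if not free:
        return assignment
    var = free[0]                                # decision: first free variable in the order
    if stats is not None:
        stats["decisions"] += 1
    for value in (True, False):
        result = dll(clauses, order, {**assignment, var: value}, stats)
        if result is not None:
            return result
    return None

# Variables 1 and 2 carry the contradiction; 3, 4, 5 are irrelevant to it.
clauses = [[1, 2], [-1, 2], [1, -2], [-1, -2], [3, 4], [4, 5]]
good, bad = {"decisions": 0}, {"decisions": 0}
dll(clauses, [1, 2, 3, 4, 5], stats=good)
dll(clauses, [3, 4, 5, 1, 2], stats=bad)
print(good["decisions"], bad["decisions"])
```

Deciding first on the variables that take part in the contradiction refutes the formula with fewer decisions than an order that starts with the irrelevant variables, which is the effect the rest of the chapter tries to obtain by prediction.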

Most modern SAT solvers are designed to deal with general CNF for- 
mulae. Using them to decide the SAT problems encountered in bounded 
model checking requires the translation of the Boolean formulae into 
CNF. Unfortunately, useful information that is unique to BMC is of- 
ten lost during this translation. In particular, the set of SAT problems 
that BMC produces for an increasing counterexample length is made 
up of problems that are highly correlated; this means that information 
learned from previous SAT problems can be used to help solve the
current problem. In this chapter, we propose a new algorithm to predict 
a good variable ordering for the SAT problems in BMC. This is a linear 
ordering computed by analyzing all previous unsatisfiable instances; the 
ordering is also successively refined as the BMC unrolling depth keeps 
increasing. We also give two different approaches (static and dynamic) 
to apply this linear ordering to SAT variable decision making. In both 
cases, the newly created ordering is combined with the default variable 
decision heuristic of the SAT solver to make the final decisions. 

Recall that whenever a Boolean formula is proved to be unsatisfiable 
by a SAT solver, there exists a final conflict that cannot be resolved by 
backtracking. Such a final conflict, represented by an empty clause, is 
the unique root node of a resolution subgraph. 

An example of the resolution subgraph is shown in Figure 8.1. The 
leaves of this graph are the clauses of the original formula (represented 
by squares on the left-hand side), and the internal nodes are the conflict 
clauses added during the SAT solving (represented by circles in the mid- 
dle). By traversing this resolution graph from the unique root backward 
to the leaves, one can identify a subset of the original clauses that are 
responsible for this final conflict. This subset of original clauses, called 
the unsatisfiable core [ZM03, GN03], is sufficient to imply unsatisfiability.
In Figure 8.1, the conflict clauses that contribute to the final conflict
are marked gray; the black squares at the left-hand side form the sub- 
set of the original clauses in the unsatisfiable core. The unsatisfiability 
proof includes both the UNSAT core and the conflict clauses involved in 
deriving the final empty clause. 

Because of the connection between the CNF formulae and the model 
(circuit), the unsatisfiable core of an unsatisfiable BMC instance implies 
[Figure 8.1: resolution graph with original clauses, conflict clauses, and the empty clause.]

Figure 8.1. Illustration of the resolution graph.


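an abstraction of the model. Let the abstraction be represented as
$\hat{M} = \langle \hat{V}, \hat{W}, \hat{I}, \hat{T} \rangle$, where $\hat{V}$ and $\hat{W}$ are subsets of $V$ and $W$, respectively. The
set of initial states and the transition relation of the abstract model,
denoted by $\hat{I}$ and $\hat{T}$, respectively, are existential abstractions of their
counterparts; that is,


$$\hat{I}(\hat{V}) = \exists (V \setminus \hat{V}) \,.\, I(V) \;,$$
$$\hat{T}(\hat{V}, \hat{W}, \hat{V}') = \exists (V \setminus \hat{V}), (W \setminus \hat{W}), (V' \setminus \hat{V}') \,.\, T(V, W, V') \;.$$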


In other words, $\hat{M}$ is constructed from $M$ by removing some state variables,
inputs, and logic gates. When a state variable $v \in (V \setminus \hat{V})$ is
removed, the logic gates that belong to the fan-in cone of $v$ but not
the fan-in cones of other state variables are also removed from $\hat{T}$. Since
all the clauses of the CNF formula come from the model and the LTL 
property, a subset of these clauses induces a subset of registers, inputs, 
and logic gates of the model. This subformula implicitly defines an ab- 
straction. 

We illustrate this connection using the example in Figure 8.2. The top 
squares in this figure represent the original clauses of the CNF formula, 
and the bottom is one copy of the circuit structure. There are k copies 
of the unrolled circuit structure in a depth-k BMC instance, one for 
each time frame. Assume that the BMC instance is unsatisfiable. The 
black squares represent clauses in the unsatisfiable core. Each clause 
corresponds to some registers or logic gates of the model. A register is
considered to be in the abstract model if and only if its present-state or 
next-state variables are in the UNSAT core. A logic gate is considered 
to be in the abstract model as long as any of the clauses describing its 
gate relation appear in the unsatisfiable core. 
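
The two membership rules above can be written down directly. The sketch below assumes a particular bookkeeping that the text does not spell out: `clause_gate` maps each clause to the gate whose relation it encodes, and `register_vars` maps each register to the CNF variables of its present-state and next-state copies. Both maps, and the function name `abstraction_from_core`, are hypothetical.

```python
def abstraction_from_core(core_clauses, clause_gate, register_vars):
    """Return (registers, gates) of the abstract model induced by an UNSAT core.

    core_clauses : iterable of clauses, each a tuple of signed integers
    clause_gate  : dict mapping a clause to the name of the gate it describes
    register_vars: dict mapping a register name to its set of CNF variables
    """
    core_vars = {abs(lit) for clause in core_clauses for lit in clause}
    # A register is kept iff one of its state variables occurs in the core.
    registers = {r for r, vs in register_vars.items() if vs & core_vars}
    # A gate is kept as soon as any clause of its gate relation is in the core.
    gates = {clause_gate[c] for c in core_clauses if c in clause_gate}
    return registers, gates

core = [(1, -2), (2, 3)]
clause_gate = {(1, -2): "g1", (2, 3): "g2", (4, 5): "g3"}
register_vars = {"r1": {1, 10}, "r2": {4, 5}, "r3": {3}}
print(abstraction_from_core(core, clause_gate, register_vars))
```

Clauses that stem from the property rather than from a gate simply have no entry in `clause_gate` and contribute only variables.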


[Figure 8.2: the original clauses (top) and one copy of the circuit (bottom).]

Figure 8.2. From unsatisfiable cores to abstractions.


The abstract model induced by an UNSAT core as described above
is an over-approximation of the original model, because the elementary 
transition relations of logic gates not included in the current abstrac- 
tion are assumed to be tautologies. The abstract model simulates the 
concrete model in the sense that, if there is no counterexample of a 
certain length in the abstract model, there is no counterexample of the 
same length in the concrete model. Had one known the current abstract 
model by an oracle, one could have applied this information to speed up 
the solving of the current BMC instance. The idea is to make decisions 
(variable assignments) only on the variables appearing in the current 
abstract model, since by definition, variables and clauses inside the un- 
satisfiable core are sufficient to prove the unsatisfiability of the BMC 
instance. By doing so, we are exploring a much smaller SAT search
space since only the logic relations among these variables are inspected, 
while the other irrelevant variables and clauses are completely ignored. 
If the size of the abstract model is small (compared to the entire model), 
this restricted SAT search is expected to be much faster. 

Of course, there is no way to know the current unsatisfiable core unless 
one solves the current SAT problem. In practice, however, the series of 
SAT problems produced by BMC for the ever increasing counterexample 
length are often highly correlated, in that their unsatisfiable cores share 
a large number of clauses. Therefore, abstract models extracted from
previous unsatisfiable BMC instances are a good estimation of the abstract
model for the current BMC instance. Also note that in bounded model
checking, the vast majority of the SAT problems are unsatisfiable. For 
passing properties, all instances are unsatisfiable (i.e., no counterexam- 
ple); for failing properties, all but the last instance are unsatisfiable. 
Therefore, one often has a sufficiently large number of previous abstract 
models for computing an estimation of the current abstraction and for 
refining the "estimation." 

The idea of identifying important decision variables from previous un- 
satisfiable instances and applying them to the current instance is illus- 
trated in Figure 8.3. Each rectangle represents a copy of the transition 
relation of the model for one time frame. The upper part of the figure 
is a BMC instance for the unrolling depth 3, and the lower part is a 
BMC instance for the unrolling depth 4. The shaded area represents 
the unsatisfiable core from the length-3 BMC instance. The dotted line 
indicates the abstraction of the model derived from an UNSAT core of 
the length-4 BMC instance. The UNSAT core for k = 3, for instance, is 
already a good estimation of the UNSAT core for k = 4. Therefore, we 
can record variables appearing in this first UNSAT core and give them a 
higher priority during decision making when solving the length-4 BMC 
instance. 


[Figure 8.3: the previous abstract model (unrolling depth 3) and the current abstract model (k=4).]

Figure 8.3. Previous abstractions to help solve the current BMC instance.


In the best-case scenario, i.e., the estimation is perfect and the pre- 
vious abstraction is already sufficient for proving the unsatisfiability of 
the current SAT instance, no variable other than those in previous
unsatisfiable cores needs to be assigned before the SAT solver stops and
reports UNSAT. Even if there are some discrepancies between the estimation
and the reality, we still expect a significant reduction in the size
of the SAT search tree by making decisions on the variables of previous
abstract models first. This predicted variable decision ordering can
also help when the current SAT instance is indeed satisfiable, since
uninteresting parts of the search space will be quickly identified and pruned
away through the addition of conflict clauses.


8.2 Refining The Decision Ordering 


All the previous unsatisfiable BMC instances are used to predict the 
variable decision ordering for the current instance. We can assign a 
score to each Boolean variable of the SAT formula. Variables appearing 
in previous unsatisfiable cores are assigned higher scores. The more
frequently a variable appears in previous UNSAT cores, the higher its
score is. These scores are combined together in solving the current BMC
instance, so that variables with higher scores are given higher priorities 
in the decision-making. 

The augmented bounded model checking algorithm is presented in
Figure 8.4. The new procedure, called REFINEORDERBMC [WJHS04],
accepts two parameters: the model $M$ and the invariant predicate $P$.
List varRank is used to store the scores of variables appearing in previous
unsatisfiable cores. Integer $k$ is the current unrolling depth. Procedure
GENCNFFORMULA generates the CNF representation of the length-$k$
BMC instance. The satisfiability of $F$ is decided by the SAT procedure
SATCHECK, which is a Chaff-like SAT solver that also takes the predetermined
ordering varRank as a parameter. Note that for the formula
$F$, varRank is often a partial ordering, since it may not have all the
Boolean variables of $F$.

When $F$ is unsatisfiable, SATCHECK computes the UNSAT core and
returns all the variables appearing in it. This set of variables, denoted
by unsatVars, is used to update varRank. The heuristic used inside
UPDATERANKING to update the variable ranking will be explained later.
After the unrolling depth $k$ is increased, the updated ordering is applied
to SATCHECK again. The entire BMC procedure terminates as soon
as $F$ becomes satisfiable, in which case the property G $P$ is proved to
be false, or the unrolling depth $k$ exceeds a predetermined completeness
threshold, in which case the property is declared true.
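
The control flow just described can be sketched as a short driver loop. This is a skeleton under stated assumptions rather than the authors' code: the CNF generator, the SAT procedure, and the ranking update are passed in as callables with hypothetical signatures, and the stubs at the bottom exist only so that the loop can be run.

```python
def refine_order_bmc(gen_cnf, sat_check, update_ranking, threshold):
    """Return False if a counterexample of length <= threshold exists, True otherwise.

    gen_cnf(k)                    -> CNF formula of the length-k BMC instance
    sat_check(formula, var_rank)  -> (is_sat, unsat_vars)
    update_ranking(unsat_vars, var_rank, k) updates var_rank in place
    """
    var_rank = {}                                  # scores of variables seen in earlier cores
    for k in range(1, threshold + 1):
        formula = gen_cnf(k)
        is_sat, unsat_vars = sat_check(formula, var_rank)
        if is_sat:
            return False                           # counterexample found: property fails
        update_ranking(unsat_vars, var_rank, k)    # refine the ordering for the next depth
    return True                                    # completeness threshold exceeded

# Stubs for demonstration only: the "model" fails at depth 3.
depths_checked = []

def demo_gen_cnf(k):
    return k

def demo_sat_check(formula, var_rank):
    depths_checked.append(formula)
    return (formula == 3, set() if formula == 3 else {"x"})

def demo_update(unsat_vars, var_rank, k):
    for v in unsat_vars:
        var_rank[v] = var_rank.get(v, 0) + k

print(refine_order_bmc(demo_gen_cnf, demo_sat_check, demo_update, threshold=10))
```

Passing the three procedures as parameters keeps the loop independent of any particular SAT solver interface.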

Inside the procedure UPDATERANKING, all the Boolean variables that 
have ever appeared in any previous unsatisfiable cores are assigned non- 
zero scores. In this scoring scheme, all previous unsatisfiable cores are 
REFINEORDERBMC (M, P)
{
    Initialize the list varRank;
    for (each k ∈ N) {
        F = GENCNFFORMULA (M, P, k);
        (isSat, unsatVars) = SATCHECK (F, varRank);
        if (isSat)
            return FALSE;
        else
            UPDATERANKING (unsatVars, varRank);
    }
    return TRUE;
}


Figure 8.4. Refining the SAT decision order in bounded model checking. 


used to determine the current variable ordering, but we give a larger
weight to the latest UNSAT cores. Let bmc_score($x$) be the score for
the variable $x$; then we have


$$\mathrm{bmc\_score}(x) = \sum_{1 \le j \le k} \text{IN-UNSAT}(x, j) \times j \;,$$


where $k$ is the current unrolling depth, and IN-UNSAT($x$, $j$) returns 1
if variable $x$ appears in the unsatisfiable core of the $j$-th BMC instance,
and returns 0 otherwise. The ranking of these variables is based on
the bmc_score: the one with a higher score gets the higher priority.
This heuristic algorithm is based on the following two observations: (1) 
one wants to give preference to the variables appearing in the most 
recent unsatisfiable cores, because they usually have higher correlation 
to the current one; and (2) one wants to avoid relying exclusively on any 
particular unsatisfiable core, because it may not always be an accurate 
estimation of the current one. 
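
As a worked instance of the scoring formula, the sketch below computes the scores from a list of previous cores, where the j-th entry of the list is the set of variables in the core of the j-th instance. The function names `bmc_scores` and `ranking` are hypothetical, and ties are broken alphabetically only to make the result deterministic; the text does not prescribe a tie-breaking rule.

```python
def bmc_scores(cores):
    """Sum, for each variable, the depths j of the cores in which it appears."""
    scores = {}
    for j, core in enumerate(cores, start=1):
        for var in core:
            scores[var] = scores.get(var, 0) + j
    return scores

def ranking(scores):
    """Variables sorted by decreasing score (ties broken by name, an assumption)."""
    return sorted(scores, key=lambda v: (-scores[v], str(v)))

cores = [{"a", "b"}, {"b", "c"}, {"b", "c"}]   # cores of depths 1, 2, 3
scores = bmc_scores(cores)
print(scores, ranking(scores))
```

Here a appears only in the oldest core and scores 1, c scores 2 + 3 = 5, and b scores 1 + 2 + 3 = 6, so the ranking is b, c, a: recent cores dominate, but no single core decides the order alone.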

In order to generate the unsatisfiable core after the SAT solver re- 
ports UNSAT, additional bookkeeping is required during the SAT solv- 
ing process. In particular, for every conflict clause (new clause learned 
from a conflict), its complete conflict graph must be recorded to memo- 
rize all the clauses that are responsible for it. Since some of these clauses 
may be conflict clauses themselves, at the end, one may have a Conflict 
Dependency Graph (CDG) [CCK+02], in which one conflict graph depends 
on another. By definition, a CDG is a directed acyclic graph. In 
the presence of a CDG, the unsatisfiable core can be easily identified 
by traversing the CDG from the final conflict backward to the original 
clauses. 

To maintain a CDG, we need to record all the conflict clauses added 
throughout the SAT search. Although some of these clauses may not be- 
long to the UNSAT core, there is no way of identifying them in advance. 
However, many modern SAT solvers, including Chaff, have a feature of 
periodically removing conflict clauses that are regarded as irrelevant (or 
less relevant) to the current search. For example, if a conflict clause has 
not been used for a considerably long time, it is considered irrelevant 
and is deleted from the clause database. This heuristic can 
reduce the total number of clauses that need to be inspected during BCP. 
Disabling this feature may slow down the SAT solver significantly when 
solving difficult SAT problems. On the other hand, if conflict clauses 
are allowed to be deleted as described above, the dependency relation in 
the CDG may be broken, which makes the construction of a complete 
unsatisfiable core impossible. 

In order to generate a complete unsatisfiable core without slowing 
down the search at the same time, we choose to maintain separately a 
simplified version of the CDG. Our observation is that the details of the 
conflict clauses are not needed in the CDG. For the purpose of identify- 
ing the unsatisfiable core, which is a subset of the original clauses, only 
the dependency relation of the conflict clauses is required. Therefore, 
our simplification mainly concerns how a conflict clause is represented: instead 
of recording both its literals and the clauses it depends on, we replace each 
conflict clause by a pseudo clause ID and retain only the dependency 
relation between the clause IDs. The use of a separate simplified CDG 
leaves the original clause database intact. Therefore, the periodic re- 
moval of irrelevant conflict clauses is not affected. Compared to the 
number of literals in a conflict clause, which is typically 100-200, the 
overhead of using an integer for the pseudo ID is small. 
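
The simplified CDG can be illustrated with the following Python sketch. It is a minimal model under our own assumptions, not the data structure used in the actual implementation: original clauses are numbered 0 to n-1, pseudo IDs for conflict clauses start at n, and the class and method names are hypothetical.

```python
from typing import Dict, Iterable, List, Set


class SimplifiedCDG:
    """Keeps only clause IDs and their dependencies, never the literals."""

    def __init__(self, num_original_clauses: int) -> None:
        self.num_original = num_original_clauses
        self.antecedents: Dict[int, List[int]] = {}
        self._next_id = num_original_clauses

    def add_conflict_clause(self, antecedent_ids: Iterable[int]) -> int:
        """Record a learned clause by the IDs of the clauses that produced it."""
        new_id = self._next_id
        self._next_id += 1
        self.antecedents[new_id] = list(antecedent_ids)
        return new_id

    def unsat_core(self, final_conflict_ids: Iterable[int]) -> Set[int]:
        """Walk backward from the final conflict to the original clauses."""
        core: Set[int] = set()
        seen: Set[int] = set()
        stack = list(final_conflict_ids)
        while stack:
            cid = stack.pop()
            if cid in seen:
                continue
            seen.add(cid)
            if cid < self.num_original:
                core.add(cid)  # an original clause belongs to the core
            else:
                stack.extend(self.antecedents[cid])
        return core


cdg = SimplifiedCDG(num_original_clauses=5)
c5 = cdg.add_conflict_clause([0, 1])
c6 = cdg.add_conflict_clause([c5, 2])
c7 = cdg.add_conflict_clause([3])  # learned but never used in the final conflict
core = cdg.unsat_core([c6, 4])
```

Because the graph stores integers only, the solver remains free to delete the literals of c5 or c7 from its clause database; the backward traversal still reaches the original clauses 0, 1, 2, and 4.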

In practice, the additional overhead of maintaining and finally travers- 
ing the simplified CDG is relatively low. In our controlled experiments, 
we have found that the additional runtime is about 5% of the total run 
time, and the memory overhead is often negligible. 

Applying previously computed variable decision ordering to the suc- 
cessive SAT calls requires a slight modification of the SAT solver. Since 
SAT solvers are different, the actual modification depends on the variable 
decision making algorithm of each individual SAT solver. The following 
discussion is for the SAT solver Chaff [MMZ+01]; however, we note that 
the proposed method can be easily adapted to other DLL-based SAT 
solvers. Chaff's default variable decision heuristic is VSIDS (for Variable 
State Independent Decaying Sum), which works as follows: Every 
literal l is associated with a score, denoted by chaff_score(l), whose 
initial value is the number of clauses of the original formula in which 
l appears, i.e., the literal count. Periodically, after a certain number of 
decisions, chaff_score(l) is updated as follows: 

    \[ \text{chaff\_score}(l) = \text{chaff\_score}(l)/2 + \text{new\_literal\_counts}(l) , \]

where new_literal_counts(l) is the number of new conflict clauses containing 
literal l. All the variables are sorted periodically by chaff_score. 
When it is time to make a decision on free variables, the variable 
whose literal l has the highest score is selected. Depending on whether 
chaff_score(l) is larger than chaff_score(¬l), the variable is assigned 
either 1 or 0. 
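
The VSIDS bookkeeping described above can be sketched in Python as follows. This is an illustration under our own assumptions, not Chaff's code: literals are signed integers in the DIMACS style (+v and -v), and the function names and the tie-breaking rule are hypothetical.

```python
from typing import Dict, Iterable, Optional, Sequence, Set, Tuple

Literal = int  # +v is the positive phase of variable v, -v the negative phase


def initial_scores(clauses: Iterable[Sequence[Literal]]) -> Dict[Literal, float]:
    """Initial score of a literal = number of original clauses containing it."""
    scores: Dict[Literal, float] = {}
    for clause in clauses:
        for lit in clause:
            scores[lit] = scores.get(lit, 0.0) + 1.0
    return scores


def decay_and_update(scores: Dict[Literal, float],
                     new_conflict_clauses: Iterable[Sequence[Literal]]) -> None:
    """Apply score(l) = score(l)/2 + new_literal_counts(l) in place."""
    new_counts: Dict[Literal, int] = {}
    for clause in new_conflict_clauses:
        for lit in clause:
            new_counts[lit] = new_counts.get(lit, 0) + 1
    for lit in set(scores) | set(new_counts):
        scores[lit] = scores.get(lit, 0.0) / 2 + new_counts.get(lit, 0)


def pick_decision(scores: Dict[Literal, float],
                  assigned_vars: Set[int]) -> Optional[Tuple[int, bool]]:
    """Choose the free literal with the highest score and make it true.

    Ties go to the smaller variable, then to the positive phase (assumption).
    """
    free = [lit for lit in scores if abs(lit) not in assigned_vars]
    if not free:
        return None
    best = max(free, key=lambda lit: (scores[lit], -abs(lit), lit))
    return abs(best), best > 0


scores = initial_scores([[1, -2], [1, 3], [-2, 3]])
decay_and_update(scores, [[-2, -3]])   # one new conflict clause
first = pick_decision(scores, set())   # literal -2 now has the top score
second = pick_decision(scores, {2})    # remaining literals are tied
```

Halving the old score before adding the new counts is what lets recently learned conflict clauses dominate the ordering over time.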

The pre-computed score bmc_score can, in principle, either replace 
chaff_score or be combined with it to determine the final decision ordering. 
However, relying exclusively on bmc_score may not be practical in 
all cases, because the score is available only for a (usually small) subset 
of variables, and it is assigned to variables rather than to the two phases 
of the same variable: both the positive and negative phases of a variable 
have the same bmc_score. Therefore, we choose to combine it with chaff_score 
in the decision making, instead of using it as the only criterion. 

Two different ways of combining the two types of scores are possible: 
One is called the static configuration, and the other is called the dynamic 
configuration. In both approaches, chaff_score is updated as usual and 
sorting of free variables inside the SAT solver is performed periodically. 
However, sorting in the static configuration is primarily by bmc_score, 
with chaff_score only as a tiebreaker. It is called static because the 
sorting criteria are fixed throughout the entire SAT solving process. 

In the dynamic configuration, the periodic sorting is initially based 
primarily on bmc_score with chaff_score as a tiebreaker. However, if 
the estimation (of the abstract model) is found to be inaccurate, the 
SAT solver can automatically switch back to the default VSIDS heuristic, 
which sorts exclusively by chaff_score. The rationale behind this 
approach is that, by starting with the ranking by bmc_score, we can 
quickly learn important clauses and prune away a significant portion 
of the search space early on. On the other hand, the VSIDS heuris- 
tic is designed to favor the most recently added conflict clauses, which 
may eventually dominate in terms of literal counts for difficult problems. 
Applying VSIDS heuristic in those cases allows the search process to be 
driven primarily by recent conflict clauses. 

The SAT problem is considered difficult when either the estimation 
of the unsatisfiable core is not accurate, or proving the unsatisfiability 
indeed needs almost all the variables. In both cases, the number of 
decisions required to solve the problem is often large; therefore, we can 
use the number of decisions to predict whether the problem is difficult. 
In the implementation of the dynamic configuration, we switch back to 
the VSIDS heuristic as soon as the number of decisions is greater than 
1/64 of the number of original literals. (This heuristic threshold was 
determined by empirical studies.) 
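
The two configurations differ only in the sort key and in whether the key may change during the search, as the following Python sketch shows. It rests on simplifying assumptions of our own: each variable carries a single chaff_score value (for instance, the larger of its two literal scores), variables without a bmc_score are treated as having score 0, and the function names are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def sort_free_variables(free_vars: Sequence[Hashable],
                        bmc_score: Dict[Hashable, int],
                        chaff_score: Dict[Hashable, float],
                        use_bmc_score: bool) -> List[Hashable]:
    """Sort by (bmc_score, chaff_score), or by chaff_score alone for plain VSIDS."""
    if use_bmc_score:
        key = lambda v: (bmc_score.get(v, 0), chaff_score.get(v, 0.0))
    else:
        key = lambda v: chaff_score.get(v, 0.0)
    return sorted(free_vars, key=key, reverse=True)


def dynamic_uses_bmc_score(num_decisions: int, num_original_literals: int) -> bool:
    """Dynamic configuration: fall back to VSIDS once the number of decisions
    exceeds 1/64 of the number of original literals."""
    return num_decisions <= num_original_literals / 64


free = ["a", "b", "c"]
bmc = {"a": 3, "b": 3}                # c never appeared in a previous UNSAT core
chaff = {"a": 1.0, "b": 2.0, "c": 9.0}

static_order = sort_free_variables(free, bmc, chaff, use_bmc_score=True)
vsids_order = sort_free_variables(free, bmc, chaff, use_bmc_score=False)
```

In the static configuration use_bmc_score stays true for the whole run; in the dynamic configuration it would be recomputed with dynamic_uses_bmc_score at each periodic sort.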


8.3 Experimental Analysis 


We have implemented the REFINEORDERBMC procedure on top of 
the bounded model checking procedure in VIS-2.0 [B+96, VIS]. The 
back-end SAT solver is Chaff [MMZ+01]. The BMC command in VIS 
is based on the basic encoding of BMC as in [BCCZ99] and the basic 
induction proof as in [SSS00]. Experimental studies were conducted 
on the set of IBM Formal Verification Benchmark circuits [IBM], each 
with an invariant property G P. The experiments were performed on a 
400MHz Pentium II with 1GB of RAM running Linux, with the timeout 
limit set to 2 hours. In our experiments, the only difference between the 
standard BMC command and REFINEORDERBMC is their SAT variable 
decision orderings. Trivial experiments that can be finished by both 
methods within 10 seconds were excluded. 

Table 8.1 compares the CPU time of the new method (with both 
static and dynamic configurations) to the standard BMC command in 
VIS. The first column is the name of the model. The second column 
indicates whether the given property is true or false. If the experiments 
cannot be finished within 2 hours, we compare the CPU time taken up 
to the maximum unrolling depth that all methods can reach; in those 
cases, the maximum unrolling depth is given in parentheses. The 
next three columns give the CPU time of the standard BMC and the 
new method with both static and dynamic configurations. 

The last two rows of Table 8.1 give the corresponding total CPU time, 
and the overall speedup of the new methods over the standard BMC. The 
overall speedup of REFINEORDERBMC with the static configuration is 
38%; the overall speedup of REFINEORDERBMC with the dynamic configuration 
is 42%. Out of the 37 circuits, the new method has achieved 
performance gains on 26 (for static) and 32 (for dynamic) circuits. The 
same results are also given in the scatter plots in Figure 8.5. Note that 
dots that are under the diagonals represent the wins by the new method. 

Figures 8.6 and 8.7 show the detailed information from the SAT solver 
while it is solving the BMC instances of the example circuit 02_3_batch_2. 

Table 8.1. Comparing BMC with and without refining the SAT decision order. 

Model    True/False or (k)    BMC time (s)    refine_order_bmc time: static (s)    dynamic (s) 
01_batch F 39 24 
02_1_batch_2 (28) 835 894 
02_3_batch_2 (65) 494 476 
02_3_batch_4 435 DE 
02_3_batch_6 368 
03_batch 222 238 
Od-batch 67 
06_batch 962 589 596 
11_batch_2 (29) 3820 4533 2932 
RUE (8) tico mis 
L-batch-1 287 
14_batch_2 F 35 30 35 
15_batch F 12 13 12 
16_1_batch (83) 6948 2256 4537 
17-1 batch. 6965 
Ibach (i2) i 
17_2_batch_1 331 4629 
17_2_batch_2 (141) 7181 3268 
18_batch (20) 1172 1049 
19_batch F 139 108 
20_batch (28) 3748 5617 3992 
beh ; Ñ — 
22_batch 3986 
25 batch 3644 
24_1_batch_1 1182 
24_1_batch_2 TI 1053 
24_1_batch_3 5075 782 1054 
25_batch 3069 2922 
27_batch 37 
28_batch 683 
29_batch 4270 
31_1_batch_1 4491 
31_1_batch_2 (i) 3552 
31_1_batch_3 3748 
312 batch- 2600 
31_2_batch_2 (19) 6924 3180 5475 
Total 138,632 86,212 79,021 


Percentage 100% 62% 57% 

Figure 8.5. Scatter plots: plain BMC vs. BMC with the refined ordering. 

Figure 8.6. Reduction of the size of decision trees on Circuit 02_3_batch_2: plain 
BMC vs. BMC with the refined ordering. 

[Plots for Figure 8.7: “Number of Conflict Clauses” and “Number of Implications” versus BMC unrolling depth (1 to 61), logarithmic vertical axes, for BMC and ref. ord BMC.] 

Figure 8.7. Reduction of the number of conflicts and implications on Circuit 
02_3_batch_2: plain BMC vs. BMC with the refined ordering. 

In all these figures, the horizontal axis represents the different BMC 
unrolling steps. The two plots in Figure 8.6 compare plain BMC with 
REFINEORDERBMC (static) on the “maximum decision level” and the 
“number of decisions.” The two plots in Figure 8.7 compare plain 
BMC with REFINEORDERBMC on the “number of conflict clauses” and 
the “number of implications.” 

With the help of the predicted variable decision ordering, the sizes 
of the SAT search trees have been significantly reduced, as shown by 
Figure 8.6. At the right-hand side of the figures (when BMC depths are 
large), the reductions can be up to two orders of magnitude. In addition, 
the number of conflicts and the number of implications are also reduced 
significantly, as shown in Figure 8.7. These reductions in turn translate 
into shorter CPU times. 


8.4 Further Discussion 


We have presented a new algorithm for predicting and successively 
refining the variable decision ordering for SAT problems encountered 
in bounded model checking. The algorithm is based on the analysis 
of the unsatisfiable cores of previous BMC instances. We have described 
both the static and dynamic configurations for applying this new 
ordering to the decision making inside SAT solvers, using the SAT 
solver Chaff as an example. Our experiments conducted on industrial 
designs have shown that the new method significantly outperforms the 
standard BMC. Further experimental analysis has also indicated that 
the performance improvement is due to the reduction of the sizes of the 
SAT search trees. 

The proposed algorithm exploits a unique characteristic of SAT-based 
bounded model checking: the different SAT problems are highly 
correlated. It complements existing decision heuristics of the SAT solvers 
used for BMC. We believe that the same idea is also applicable to 
SAT-based problems other than bounded model checking, as long as their 
subproblems have a similar incremental nature. 

Tuning the SAT solver for BMC was first studied by Shtrichman in 
[Sht00], where a predetermined variable ordering was extracted by tra- 
versing the variable dependency graph in a topological order. The entire 
BMC instance can be regarded as a large combinational circuit lying on 
a plane in which the horizontal axis represents different time frames and 
the vertical axis represents different registers (or state variables). By 
either forward or backward traversal of the variable dependency graph, 
Shtrichman sorted the SAT variables according to their positions on 
the “time axis.” In contrast, our new method sorts the SAT variables 
according to their positions on the other axis, the “register axis.” 

Information from the circuit structure was also used in previous work 
to help the SAT search. In [GAG+02], Ganai et al. proposed a hybrid 
representation as the underlying data structure of their SAT solver. 
Both circuits and CNF formulae were included in order to apply fast 
implication on the circuit structure and at the same time retain the 
merit of CNF formulae. In [GGW+03b], Gupta et al. applied implications 
learned from the circuit structure (statically and dynamically) to 
help the SAT search, where the implications were extracted by BDD 
operations. In [LWCH03], Lu et al. proposed to use circuit topological 
information and signal correlations to enforce a good decision ordering 
in their circuit SAT solver. The correlated signals of the underlying cir- 
cuit were identified by random simulation, and were then applied either 
explicitly or implicitly to the SAT search. In their explicit approach, the 
original SAT problem was decomposed into a sequence of subproblems 
for the SAT solver to solve one-by-one. In their implicit approach, cor- 
related signals were dynamically grouped together in such a way that 
they were most likely to cause conflicts. 

The incremental nature of the BMC instances was also exploited by 
several incremental SAT solvers [WKS01, ES03]. These works focused 
primarily on how to incrementally create a SAT instance with as lit- 
tle modification as possible to the previous one, and on how to re-use 
previously learned conflict clauses. However, refining the SAT decision 
ordering has not been studied in these incremental solvers. Therefore, 
the method proposed in this chapter can be combined with the incre- 
mental SAT techniques to further improve their performance. 


Chapter 9 


CONCLUSIONS 


The purpose of the research described in this book is to apply model 
checking techniques to the verification of large real-world systems. We 
believe that automatic abstraction is the key to bridging the capacity 
gap between the model checkers and industrial-scale designs. The main 
challenge in abstraction refinement is related to the ability of reaching 
the optimum abstraction, i.e., a succinct abstraction of the concrete 
model that decides the given property. In this book, we have proposed 
several fully automatic abstraction refinement techniques to efficiently 
reach or come near the optimum abstraction efficiency. 


9.1 Summary of Results 


In Chapter 3, we have proposed a new fine-grain abstraction approach 
to push the granularity of abstraction refinement beyond the usual state 
variable level. By keeping the abstraction granularity small, we add 
only the information relevant to verification into the abstract models 
at each refinement iteration step. Our experience with industrial-scale 
designs shows that fine-grain abstraction is indispensable in verifying 
large systems with complex combinational logic. 


In Chapter 4, we have proposed a new generational refinement algorithm, 
GRAB, in which we use a game-based analysis of all shortest 
counterexamples in the SORs to select refinement variables. By systematically 
analyzing all the shortest counterexamples, GRAB identifies 
important refinement variables from the local support of the current ab- 
stract model. The global guidance from all shortest counterexamples 
and the scalable refinement variable selection computation are critical 
for GRAB to achieve a higher abstraction efficiency. Compared to previous 
single counterexample guided refinement methods, GRAB often 
produces a smaller final abstract model with less run time. 


In Chapter 5, we have proposed the DNC compositional SCC analysis 
algorithm, which quickly identifies uninteresting parts of the state space 
of previous abstract models and prunes them away before going to the 
next abstraction level. We also exploit the fact that the strength of an 
SCC or a set of SCCs decreases monotonically with refinement, by tailor- 
ing the model checking procedure to the strength of the SCC at hand. 
DNC is able to achieve a speed-up of up to two orders of magnitude 
over standard symbolic fair cycle detection algorithms, indicating the 
effectiveness of reusing information learned from previous abstractions 
to help the verification at the current level. 


In Chapter 6, we have proposed a state space decomposition algo- 
rithm in which the SCC quotient graph of an abstract model is used to 
disjunctively decompose the concrete state space. A nice feature of this 
decomposition is that we can perform fair cycle detection in each state 
subset in isolation without introducing inconclusives. We have also pro- 
posed a new guided search algorithm for symbolic fair cycle detection 
which can be used at the end of the adaptive popcorn-line policy in the 
DNC framework. Our experiments show that for large systems or other- 
wise difficult problems, heavy investment in disjunctive decomposition 
and guided search can significantly improve the performance of DNC. 


In Chapters 7 and 8, we have proposed two new algorithms to improve 
the performance of BDD-based symbolic image computation and the 
Boolean SAT check in the context of bounded model checking. The two 
decision procedures are basic engines of most symbolic model checking 
algorithms; for both of them, we have applied the general idea of abstrac- 
tion followed by successive refinements. In Chapter 7, we first compute 
a set of over-approximated images and apply them as care sets to the 
far side of the transition relations; the minimized transition relation is 
then used to compute the exact image. In Chapter 8, we first predict 
a good variable decision ordering based on the UNSAT core analysis of 
previous BMC instances, and then use the new ordering to improve the 
SAT check of the current instance. Our experiments on a set of industry 
benchmark designs show that the proposed techniques can achieve 
significant performance improvements over the prior art. 


In conclusion, the new algorithms presented in this book have signifi- 
cantly advanced the state of the art for model checking. With automatic 
abstraction, model checking has been successfully applied to hardware 
systems with more than 4000 binary state variables. Evidence has shown 
that the advantages of these new techniques generally increase as the 
models under verification get larger. Therefore, these techniques will 
play important roles in verifying industrial-scale systems. 


We regret not giving more results on benchmark examples with more 
than 1000 state variables. At the time this research was conducted, there 
were not many industrial-scale verification benchmarks in the public 
domain (and there are not many even now). Most companies in the 
computer hardware design and EDA industry are reluctant to share 
their designs with university investigators. We sincerely hope that this 
situation will change in the future. 


9.2 Future Directions 


In this book, the quest for optimum abstraction has been put into 
a synthesis perspective, where refinement is regarded as a process of 
synthesizing the smallest deciding abstract model. An interesting open 
question is related to the theoretical complexity of finding the optimum 
abstraction. Although finding the optimum abstraction is at least as 
hard as model checking itself, understanding the theoretical aspect of 
this problem may shed light on designing more practical algorithms. 
According to our own experience in design and analysis of VLSI/CAD 
algorithms, good practical algorithms often come from formulating an 
optimal algorithm and then making intuitive simplifications to deal with 
complexity in practice. 

The GRAB refinement algorithm relies on the analysis of all the short- 
est counterexamples. An interesting extension is to find, among all the 
shortest counterexamples, the one counterexample with the maximum 
likelihood, and apply the same variable selection heuristic to this max- 
imum likelihood path. Note that when a property fails in an abstract 
model, the abstract counterexamples in the synchronous onion rings may 
have different concretization probabilities. When the property is most 
likely to be false in the concrete model, it may be advantageous to focus 
on those counterexamples that are most likely to be concretizable. 


In this book, we rely primarily on BDD-based symbolic fixpoint com- 
putation to check whether the given property holds in an abstract model. 
Recent advances in SAT algorithms have demonstrated the possibility 
of replacing BDDs with CNF formulae in fixpoint computation [ABE00, 
GYAG00, WBCG00, McM02]. The analysis of abstract models can also 
be performed by using BMC induction proof techniques; in [LWS03, 
LWS05], Li et al. have shown that SAT-based decision procedures often 
complement BDD-based symbolic fixpoint computation in the analysis 
of the abstract models. Therefore, the integration of BDD-based algorithms 
with SAT-based algorithms for the analysis of the abstract models 
should lead to a more robust and powerful approach to abstraction 
refinement. 

In the more general area of symbolic model checking, research in at- 
tacking the capacity problem can be classified into two categories. One 
is at the “lower level,” which includes the improvement of both runtime 
and memory performance of basic decision procedures. The other is at 
the “higher level,” which includes methods like abstraction refinement, 
compositional reasoning [AL91, Var95, McM97, McM98, HQR98], symmetry 
reduction [ES93, ID93], etc. There will be improvements on the 
lower level algorithms in the years to come; however, we believe that to 
bridge the existing verification capacity gap, the bulk of the improve- 
ment has to come from advances in the higher level techniques. An 
interesting future research direction is to apply techniques developed in 
the abstraction refinement framework to compositional reasoning and 
symmetry reduction. These methods share a common idea—simplifying 
the model before applying model checking to it, even though the meth- 
ods of simplification are significantly different. 